Patent abstract:
Approach for signal reshaping. The present invention relates to statistical values that are calculated based on received source images. An adaptive reshaping function is selected for one or more source images based on one or more of the statistical values. A portion of source video content is adaptively reshaped, based on the selected adaptive reshaping function, to generate a portion of reshaped video content. The source video content portion is represented by the one or more source images. An approximation of an inverse of the selected adaptive reshaping function is determined. The reshaped video content and a set of adaptive reshaping parameters that define the approximation of the inverse of the selected adaptive reshaping function are encoded in a reshaped video signal. The reshaped video signal may be processed by a downstream receiver device to generate a reconstructed version of the source images, for example for display on a display device.
Publication number: BR112017018552B1
Application number: R112017018552
Filing date: 2016-03-17
Publication date: 2019-10-22
Inventors: Su Guan-Ming; Chou Hsuan-Ting; Kamballur Kottayil Navaneeth; Wang Qiuwei
Applicant: Dolby Laboratories Licensing Corp.
Primary IPC class:
Patent description:

APPROACH FOR SIGNAL RESHAPING
CROSS-REFERENCE TO RELATED APPLICATIONS [001] The present invention claims the benefit of US provisional patent application No. 62/136,402, filed on March 20, 2015, and US provisional patent application No. 62/199,391, filed on July 31, 2015, both of which are incorporated herein by reference in their entirety.
TECHNOLOGY [002] The present invention relates generally to image processing and, in particular, to the encoding, decoding, and representation of video data.
BACKGROUND [003] Video data, as provided by upstream devices to downstream devices, can support a variety of dynamic ranges, color spaces, etc. Dynamic ranges can vary from brightness levels of 10,000 candelas per square meter (10,000 nits), 12,000 candelas per square meter (12,000 nits), or even more at the high end, down to brightness levels of 100 candelas per square meter (100 nits), 300 candelas per square meter (300 nits), 600 candelas per square meter (600 nits), etc., at the low end. Color spaces can include, without limitation, linear color spaces, non-linear color spaces, perceptually quantized color spaces, etc.
[004] In addition, metadata related to the operational parameters used by upstream devices to encode video data may be necessary for downstream devices to generate the operational parameters used to decode the video signals generated by the upstream devices. The amount of metadata that the downstream devices would need for relatively high quality coding operations could be too large to be transmitted to and/or processed by many of the downstream devices.
Petition 870170063697, of 8/29/2017, p. 10/118
[005] The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, none of the approaches described in this section should be considered to qualify as prior art merely by virtue of their inclusion in this section. Similarly, problems identified in relation to one or more approaches should not be assumed to have been recognized in any prior art on the basis of this section, unless otherwise indicated.
BRIEF DESCRIPTION OF THE DRAWINGS [006] The present invention is illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings, in which like reference numbers refer to like elements, and in which:
[007] Figure 1A illustrates a video encoder;
[008] Figure 1B illustrates a video decoder;
[009] Figure 2 illustrates an exemplary method of approximating a reshaping function;
[010] Figures 3A and 3B illustrate exemplary processing flows for approximating a target LUT (lookup table);
[011] Figures 4A to 4C illustrate exemplary algorithms for image classification;
[012] Figures 5A and 5B illustrate exemplary processing flows for adaptive reshaping and inverse mapping; and
[013] Figure 6 illustrates an exemplary hardware platform on which a computer or computing device, as described in the present invention, can be implemented.
DESCRIPTION OF EXAMPLE EMBODIMENTS [014] Example embodiments related to the encoding, decoding, and representation of video data are described here. In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, that the present invention can be practiced without these specific details. In other instances, well-known structures and devices are not described in exhaustive detail, to avoid unnecessarily obscuring the present invention.
[015] Example embodiments are described here according to the following outline:
1. OVERVIEW
2. ENCODING ADAPTIVELY RESHAPED VIDEO CONTENT
3. DECODING ADAPTIVELY RESHAPED VIDEO CONTENT
4. POWER FUNCTIONS FOR ADAPTIVE RESHAPING
5. APPROXIMATING FUNCTIONS RELATED TO ADAPTIVE RESHAPING
6. EXAMPLE PROCESSING FLOWS
7. REAL-TIME OPTIMIZATIONS
8. IMPLEMENTATION MECHANISMS - HARDWARE OVERVIEW
9. EQUIVALENTS, EXTENSIONS, ALTERNATIVES AND MISCELLANEOUS
1. Overview [016] This overview provides a basic description of some aspects of an example embodiment of the present invention. It should be noted that this overview is not an extensive or exhaustive summary of aspects of the example embodiment. In addition, it should be noted that this overview is not intended to be understood as identifying any particularly significant aspects or elements of the example embodiment, nor as delineating any scope of the example embodiment in particular, nor of the invention in general. This overview merely presents some concepts that relate to the example embodiment in a condensed and simplified format, and should be understood as a mere conceptual prelude to the more detailed description of the example embodiments that follows below.
[017] Without the use of adaptive reshaping, as described here, a video codec assigns code words at a relatively low bit depth (for example, 8-bit video signals on two channels, 10-bit video signals on a single channel, etc.) and may therefore fail to preserve the visual details of source video content (for example, reference-encoded video content, PQ (perceptual quantizer) reference-encoded video content, etc.) coded with code words available at a relatively high bit depth. For example, a codec (for example, a gamma-domain codec, etc.) that does not use adaptive reshaping, as described here, can over-allocate code words to a high-luminance subrange (for example, bright portions, highlights, etc.) and under-allocate code words to a low-luminance subrange (for example, dark portions, dark areas, etc.). As a result, the visual details of the perceptually encoded source video content can be unnecessarily lost by using these other techniques.
[018] A video codec that implements adaptive reshaping techniques, as described in the present invention, allocates available code words at a specific bit depth (for example, 8 bits, etc.) in a way that preserves the visual details of a wide variety of source video content. In some embodiments, a video codec selects/determines specific parameter values (for example, exponent values in power functions, the slope in linear quantization, pivots in piecewise linear quantization, etc.) based on the results of analyzing the image content carried by image frames (for example, in a scene, etc.) in the source video content. If the image content comprises more highlighted portions or more luminance levels, the parameter values can be selected/determined to make more luminance levels represented in the high-luminance subrange available for encoding and decoding operations. If the image content comprises fewer highlighted portions or fewer luminance levels, the parameter values can be selected/determined to make fewer luminance levels represented in the high-luminance subrange available for encoding and decoding operations. Similarly, if the image content comprises more dark portions or more luminance levels, the parameter values can be selected/determined to make more luminance levels represented in the low-luminance subrange available for encoding and decoding operations. If the image content comprises fewer dark portions or fewer luminance levels, the parameter values can be selected/determined to make fewer luminance levels represented in the low-luminance subrange available for encoding and decoding operations.
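The content-dependent parameter selection described above can be sketched in a few lines. The thresholds, the candidate exponents, and the histogram heuristic below are illustrative assumptions for a power reshaping function, not the patented algorithm:

```python
def select_power_exponent(luma, dark_thresh=0.2, bright_thresh=0.7):
    """Pick an exponent for a power reshaping function s -> s**alpha based on
    how normalized luma values are distributed. Heuristic thresholds and
    exponent choices are assumptions for illustration only."""
    n = len(luma)
    dark_frac = sum(1 for v in luma if v < dark_thresh) / n
    bright_frac = sum(1 for v in luma if v > bright_thresh) / n
    if dark_frac > 0.5:
        return 0.7   # mostly dark: steeper slope (more code words) in the darks
    if bright_frac > 0.5:
        return 1.5   # mostly bright: steeper slope in the highlights
    return 1.0       # mid-tone image: near-identity allocation

def reshape(luma, alpha, out_bits=10):
    """Map normalized luma values to reshaped integer code words."""
    max_code = (1 << out_bits) - 1
    return [round((v ** alpha) * max_code) for v in luma]
```

With alpha below 1 the curve is steeper near zero, so more of the output code word range is spent on the low-luminance subrange, as the paragraph above describes.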
[019] The techniques described here can be used to support coding operations (for example, encoding, decoding, transcoding, etc.) for video signals (for example, encoded bitstreams, etc.) that comprise a single layer, or more than one layer (for example, two layers, multiple layers, a combination of a base layer and one or more enhancement layers, etc.). These techniques can be implemented in software, hardware, or a combination of software and hardware, and can be adopted by a variety of computing devices, multimedia devices, mobile devices, etc. At least some of the techniques can be packaged in the form of one or more technical feature profiles (for example, a roaming profile, a tablet profile, a home entertainment system profile, etc.), which can be released, independently or in combination, to other vendors, developers, manufacturers, etc.
[020] A video codec that implements the techniques described here, for adaptive video signal reshaping, can be used to support one or more backward-compatible video applications, non-backward-compatible video applications, etc. Examples of systems with such a video codec may include, but are not limited to, any of the following: a single-layer 12-bit codec, a dual-layer 8-bit codec, a multi-layer codec, a non-backward-compatible reshaping codec, a backward-compatible codec, a codec that implements a set of settings/requirements/options in Advanced Video Coding (AVC), a codec that implements a set of settings/requirements/options in High Efficiency Video Coding (HEVC), etc.
[021] Some examples of non-backward-compatible reshaping codecs are described in patent application PCT/US2014/031716, filed on March 25, 2014, owned by the assignee of the present application, the contents of which are incorporated herein by reference for all purposes as if fully set forth here. Some examples of backward-compatible reshaping codecs are described in patent application PCT/US2012/062932, filed on November 1, 2012, owned by the assignee of the present application, the contents of which are incorporated herein by reference for all purposes as if fully set forth here.
[022] In some embodiments, a codec system as described here implements a curve approximation method to approximate an arbitrary reshaping function, using a limited number of polynomials that minimize a global maximum error. Additionally, optionally, or alternatively, the codec system can implement an adaptive parameter selection algorithm to determine or choose the adaptive reshaping parameters used in conjunction with the reshaping function, to obtain better perceptual quality than would otherwise be possible.
[023] A source video package for a media program can have a relatively large file size, as the source video package can comprise source video content with relatively high spatial resolution (for example, 4k, 8k, etc.), a relatively large dynamic range, and a relatively wide color gamut. In some embodiments, source video content that has been encoded in a video signal with a relatively high bit depth (for example, a 12-bit video signal, a 14+ bit video signal, etc.) can be transcoded into much smaller encoded video content, based, at least in part, on adaptive signal reshaping.
[024] For example, the source content can be encoded in a 12-bit PQ video signal, with source code words that correspond to fixed (for example, non-adaptively reshaped, etc.) luminance-related values or chroma-related values from image to image, from scene to scene, from media program to media program, etc.
[025] The term PQ, as used here, refers to perceptual quantization of luminance amplitude. The human visual system responds to increasing light levels in a very non-linear way. A human's ability to see a stimulus is affected by the luminance of that stimulus, the size of the stimulus, the spatial frequencies that make up the stimulus, and the luminance level to which the eyes have adapted at the specific moment the stimulus is viewed. In one embodiment, a perceptual quantizer function maps linear input gray levels to output gray levels that better match the contrast sensitivity thresholds of the human visual system. Exemplary PQ mapping functions (or EOTFs) are described in SMPTE ST 2084:2014, High Dynamic Range EOTF of Mastering Reference Displays, which is incorporated here in its entirety by reference, where, given a fixed stimulus size, for each luminance level (that is, stimulus level) a minimum visible contrast step at that luminance level is selected, according to the most sensitive adaptation level and the most sensitive spatial frequency (according to HVS models). In comparison to the traditional gamma curve, which represents the response curve of a physical cathode ray tube (CRT) device and which, coincidentally, may bear a very close similarity to how the human visual system responds, a PQ curve imitates the true visual response of the human visual system using a relatively simple functional model.
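The ST 2084 PQ mapping referenced above can be written out directly from the constants in that specification; the sketch below implements the forward direction (linear light to PQ signal) and its inverse for luminance normalized against the 10,000-nit peak:

```python
# SMPTE ST 2084 (PQ) constants
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32

def pq_encode(nits):
    """Linear luminance in cd/m^2 -> non-linear PQ signal in [0, 1]."""
    y = (nits / 10000.0) ** M1
    return ((C1 + C2 * y) / (1.0 + C3 * y)) ** M2

def pq_decode(e):
    """Non-linear PQ signal in [0, 1] -> linear luminance in cd/m^2."""
    p = e ** (1.0 / M2)
    return 10000.0 * (max(p - C1, 0.0) / (C2 - C3 * p)) ** (1.0 / M1)
```

For example, pq_encode(10000.0) returns 1.0, the top of the PQ signal range, and pq_decode inverts pq_encode to within floating point precision.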
[026] The source content encoded in the 12-bit PQ video signal can have a relatively high dynamic range, such as a dynamic range of up to 12,000 nits, etc. In contrast, the encoded video content can be encoded in a 10-bit video signal with adaptively reshaped code words, which do not necessarily correspond to fixed luminance-related or chroma-related values from image to image, from scene to scene, from media program to media program, etc. Code words adaptively reshaped in a 10-bit code word space can be (for example, adaptively) mapped to the source code words in a 12-bit code word space, based on an adaptive reshaping function that can vary from image to image, from scene to scene, from media program to media program, etc. As a result, the encoded video content, while encoded in a 10-bit signal, can support a relatively high dynamic range, even up to the full dynamic range supported by the source video content that is encoded in a 12-bit signal. The adaptive reshaping function can be represented by one or more quantization curves, lookup tables (LUTs), code word mappings, etc.
[027] In some embodiments, some or all of the quantization curves, lookup tables (LUTs), code word mappings, etc., which represent the adaptive reshaping function used by an upstream device to perform the adaptive reshaping of the source video signal, can be transmitted as composition metadata, with the encoded video content being encoded in the 10-bit signal, from the upstream device (for example, a video encoder, a video transcoder, etc.) directly or indirectly to downstream receiver devices (for example, a video decoder, a video transcoder, etc.). The adaptive reshaping function, as represented by quantization curves, lookup tables (LUTs), code word mappings, etc., can be used by downstream receiver devices to reconstruct a version of the source video content from the encoded video content. For example, code words adaptively reshaped in the encoded video content can be inverse-mapped, based on the adaptive reshaping function or an inverse of it, to a set of code words (for example, in a 12-bit code word space, etc.) equivalent to the source code words that were in the source video content.
[028] The composition metadata, which include a representation of the adaptive reshaping function, may be too large to be transmitted to and/or processed by downstream devices. Downstream devices with difficulties in processing relatively large amounts of metadata (for example, metadata related to video processing, such as composition metadata, etc.) may include, but are not limited to, any of the following: mobile devices, compact devices, computing devices with relatively limited resources for video processing, computing devices that incorporate system-on-chip (SoC) modules with relatively limited resources for video processing, computing devices that incorporate video signal formats, implementations, designs, hardware, software, firmware, etc., that support the transmission/reception of only relatively small amounts of metadata, etc.
[029] Using the techniques described here, an adaptive reshaping function can be approximated through a limited number of simple mathematical functions, such as polynomials, piecewise linear (PWL) segments, etc. In a non-limiting exemplary embodiment, an adaptive reshaping function is approximated through a limited number (for example, 4, 8, 12, any positive integer greater than one, etc.) of polynomial functions (for example, linear, second order, etc.) with a limited number of bits for each coefficient and with rounding errors minimized. Minimizing the approximation errors when approximating the adaptive reshaping function with the limited number of polynomial functions leads to minimizing the errors in the inverse mapping, which is performed based on the approximation of the adaptive reshaping function with the limited number of polynomial functions.
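As a rough illustration of approximating an arbitrary LUT with a limited number of simple segments, the sketch below greedily fits piecewise-linear pieces while keeping each piece's maximum absolute error within a tolerance. The greedy strategy is an assumption for illustration; the invention's actual minimax fitting procedure may differ:

```python
def pwl_approximate(lut, tol):
    """Cover a target LUT with piecewise-linear segments so that each
    segment's maximum absolute error stays within `tol`.
    Returns a list of (start_index, end_index, slope, offset) pieces."""
    pieces, start, n = [], 0, len(lut)
    while start < n - 1:
        end = start + 1
        # extend the segment as long as the chord stays within tolerance
        while end + 1 < n:
            slope = (lut[end + 1] - lut[start]) / (end + 1 - start)
            err = max(abs(lut[i] - (lut[start] + slope * (i - start)))
                      for i in range(start, end + 2))
            if err > tol:
                break
            end += 1
        slope = (lut[end] - lut[start]) / (end - start)
        pieces.append((start, end, slope, lut[start] - slope * start))
        start = end
    return pieces

def pwl_eval(pieces, x):
    """Evaluate the piecewise-linear approximation at index x."""
    for s, e, m, b in pieces:
        if s <= x <= e:
            return m * x + b
    raise ValueError("x outside LUT domain")
```

A perfectly linear LUT collapses to a single piece, while a curved LUT is split into only as many pieces as the tolerance requires, which is the metadata-size saving the paragraph above motivates.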
[030] Adaptive reshaping can be performed on multiple video signals and/or types of video content in many different ways. In particular, the techniques described here are applicable to the approximation of any adaptive reshaping function, including, but not limited to, an arbitrary LUT.
[031] In some exemplary embodiments, mechanisms such as those described here form part of a media processing system, including, but not limited to, any of the following: a portable device, a game machine, a television, a laptop, a netbook, a tablet, a cell phone, a digital book reader, a point of sale (POS) terminal, a desktop computer, a computer kiosk (self-service), or various other types of terminals and media processing units.
[032] Various modifications to the preferred embodiments, and the generic principles described here, will be readily apparent to those skilled in the art. Thus, the disclosure is not intended to be limited to the embodiments shown, but is to be accorded the broadest scope consistent with the principles and features described here.
2. Encoding adaptively reshaped video content [033] Figure 1A illustrates an exemplary video encoder 102 that can be used as an upstream device to provide an encoded output video signal (or a reshaped video signal), with the video content adaptively reshaped, to downstream devices (one of which may be, for example, a video decoder 152 of Figure 1B, etc.). The video encoder (102) can be implemented with one or more computing devices. In some embodiments, the video encoder (102) comprises a source content decoder 104, an adaptive content reshaper 106, a reshaped content encoder 108, etc.
[034] In some embodiments, the source content decoder (104) comprises software, hardware, or a combination of hardware and software, etc., configured to receive one or more source video signals (for example, encoded bitstreams, etc.) and decode the source video signals into source video content. In some embodiments, the source video content is decoded from a single-layer video signal encoded with the source video content in a single layer. In some embodiments, the source video content is decoded from a multi-layer encoded video signal encoded with the source video content in more than one layer (for example, a base layer and one or more enhancement layers, etc.).
[035] In some embodiments, the adaptive content reshaper (106) comprises software, hardware, or a combination of software and hardware, etc., configured to perform adaptive reshaping operations on the source video content to generate reshaped video content. One or both of the source video content and the reshaped video content can be used in one or more of backward-compatible (BC) video applications, non-backward-compatible (NBC) video applications, etc.
[036] In some embodiments, the adaptive content reshaper (106) is configured to select and apply a reshaping function to reshape source code words in one or more images, one or more scenes, etc., represented in the source video content into reshaped code words in one or more corresponding images, one or more corresponding scenes, etc., represented in the reshaped video content. According to the techniques described here, the selection of the reshaping function and/or of the adaptive reshaping parameters used in the reshaping function is done adaptively, based on the actual content of the images, scenes, etc., as represented in the source video content. Additionally, optionally, or alternatively, the selection of the reshaping function and/or of the adaptive reshaping parameters used in the reshaping function can be done adaptively while these images, scenes, etc., are being processed by the video encoder (102).
[037] The adaptive content reshaper (106) can, but is not limited to, be configured to use forward power functions as reshaping functions. The adaptive content reshaper (106) can be configured to determine whether an image contains large smooth bright areas, large dark black areas, etc., or whether an image is a mid-tone image, etc. Based on this determination, adaptive reshaping parameters, such as the exponent values of the forward power functions, etc., can be selected.
[038] In some embodiments, the adaptive content reshaper (106) applies adaptive reshaping operations to source code words in the source video content directly, based on a selected adaptive reshaping function with selected adaptive reshaping parameters.
[039] In some embodiments, an adaptive reshaping function can be represented by a LUT that comprises a plurality of entries, each of which maps a source code word, in an available set of source code words used to encode the source video content, to a reshaped code word in an available set of reshaped code words used to encode the reshaped video content. A first LUT used to reshape one or more first images in the source video content may be different from a second LUT used to reshape one or more second images in the source video content. In some embodiments, the set of available source code words may remain the same for both the first images and the second images. For example, if the adaptive content reshaper (106) determines that the first images are smooth bright images, then the LUT, or the adaptive reshaping function that the LUT represents, may have a relatively large number of available reshaped code words corresponding to bright luminance values. As a result, artifacts such as contouring/banding can be reduced or avoided, even when the reshaped video content is encoded in a reshaped video signal (for example, a dual-layer 8-bit video signal, a single-layer 10-bit video signal, etc.) with a bit depth lower than that of a source video signal (for example, a 12-bit video signal, a 14+ bit video signal, etc.). On the other hand, if the adaptive content reshaper (106) determines that the second images are smooth dark images (but not pure black images), then the LUT, or the adaptive reshaping function that the LUT represents, may have a relatively large number of available reshaped code words corresponding to dark luminance values. As a result, image details in dark areas can be preserved in the reshaped video content encoded in the reshaped video signal. In some embodiments, the adaptive content reshaper (106) applies adaptive reshaping operations to source code words in the source video content based on a LUT; the LUT can be generated based on a selected adaptive reshaping function, or the LUT itself can be considered a selected adaptive reshaping function.
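A forward reshaping LUT of the kind described in this paragraph can be tabulated from any curve mapping [0, 1] to [0, 1]. In the sketch below the square-root curve is only a stand-in for a reshaping function that allocates extra code words to dark regions:

```python
def build_forward_lut(src_bits, dst_bits, curve):
    """Tabulate `curve` (a [0,1] -> [0,1] reshaping function) as a forward
    LUT mapping every source codeword to a reshaped codeword."""
    src_max, dst_max = (1 << src_bits) - 1, (1 << dst_bits) - 1
    return [round(curve(c / src_max) * dst_max) for c in range(src_max + 1)]

# Example: allocate extra code words to the darks with a sqrt-like curve
lut = build_forward_lut(12, 10, lambda s: s ** 0.5)
```

Because the curve is steep near zero, a 12-bit source code word 10% up the range lands far above 10% of the 10-bit output range, which is exactly the dark-detail preservation the paragraph above describes.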
[040] In some embodiments, the adaptive content reshaper (106) determines an approximation of a (target) LUT that represents, or is equivalent to, a reshaping function. For example, the adaptive content reshaper (106) can, but is not limited to, approximate the LUT with polynomials with specifically determined coefficients, to minimize the errors between the mapping represented by the polynomials and the mapping represented by the target LUT. In some embodiments, the adaptive content reshaper (106) applies adaptive reshaping operations to source code words in the source video content based on polynomials that approximate the target LUT, or a reshaping function represented by the LUT.
[041] Regardless of how the video encoder (102) applies adaptive reshaping operations (for example, based on a reshaping function, such as an analytical or non-analytical function or a piecewise function; based on a LUT that may or may not represent an analytical function; based on a LUT approximation that may or may not be generated based on an analytical function; etc.), the video encoder (102) can be configured to generate one or more types of adaptive reshaping parameters and transmit at least one of the one or more types of adaptive reshaping parameters to downstream receiver devices.
[042] In some embodiments, the adaptive content reshaper (106) is configured to determine an approximation of a (target) LUT (or backward LUT, a LUT with inverted lookup) that represents an inverse of a reshaping function. Composition metadata that define the approximation of the target LUT, and that represent the inverse of the reshaping function, can be generated and transmitted as part of the total metadata transmitted in the reshaped video signal by the video encoder (102) to downstream receiver devices, such as a video decoder 152 of Figure 1B, etc.
[043] In some embodiments, the video decoder (152) can be configured to receive or reconstruct the approximation of the target LUT that represents the inverse of the reshaping function, based on the composition metadata decoded/extracted from the reshaped video signal. The video decoder (152) can be configured to apply inverse mapping operations to the reshaped video content originating from the video encoder, in the form decoded from the reshaped video signal, using the approximation of the target LUT, regardless of whether the adaptive content reshaper (106) applies adaptive reshaping operations to the source code words in the source video content based on a reshaping function, or alternatively based on a forward-lookup LUT that represents the reshaping function, or alternatively based on an approximation of a forward-lookup LUT.
[044] Additionally, optionally, or alternatively, in some embodiments, the adaptive content reshaper (106) is configured to generate composition metadata that define a target LUT that represents the inverse of a reshaping function, and to transmit the composition metadata as part of the total metadata transmitted in the reshaped video signal by the video encoder (102) to downstream receiver devices, such as a video decoder 152 of Figure 1B, etc. In some embodiments, the video decoder (152) can be configured to receive or reconstruct the target LUT based on the composition metadata decoded/extracted from the reshaped video signal. The video decoder (152) can be configured to apply inverse mapping operations to the reshaped video content originating from the video encoder, in the form decoded from the reshaped video signal, using the target LUT, regardless of whether the adaptive content reshaper (106) applies adaptive reshaping operations to the source code words in the source video content based on a reshaping function, or alternatively based on a forward-lookup LUT that represents the reshaping function, or alternatively based on an approximation of a forward-lookup LUT.
[045] Additionally, optionally, or alternatively, in some embodiments, the adaptive content reshaper (106) is configured to generate composition metadata that define an inverse of a reshaping function, and to transmit the composition metadata as part of the total metadata transmitted in the reshaped video signal by the video encoder (102) to downstream receiver devices, such as a video decoder 152 of Figure 1B, etc. In some embodiments, the video decoder (152) can be configured to receive or reconstruct the inverse of the reshaping function based on the composition metadata decoded/extracted from the reshaped video signal. The video decoder (152) can be configured to apply inverse mapping operations to the reshaped video content originating from the video encoder, in the form decoded from the reshaped video signal, using the inverse of the reshaping function, regardless of whether the adaptive content reshaper (106) applies adaptive reshaping operations to the source code words in the source video content based on a reshaping function, or alternatively based on a forward-lookup LUT that represents the reshaping function, or alternatively based on an approximation of a forward-lookup LUT.
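The decoder-side inverse mapping described in these paragraphs can be illustrated by inverting a monotone forward LUT into a backward LUT. In practice the composition metadata would carry a compact polynomial or PWL approximation of the inverse rather than a full table, so the direct inversion below is a simplified sketch:

```python
def build_backward_lut(forward_lut, dst_bits):
    """Invert a monotone non-decreasing forward LUT: for each reshaped
    codeword, pick the first source codeword whose forward mapping
    reaches it. Sketch of what a decoder reconstructs from composition
    metadata."""
    dst_max = (1 << dst_bits) - 1
    backward, src = [], 0
    for d in range(dst_max + 1):
        while src < len(forward_lut) - 1 and forward_lut[src] < d:
            src += 1
        backward.append(src)
    return backward
```

Chaining the backward LUT after the forward LUT recovers each source code word up to the quantization error of the lower-bit-depth reshaped signal.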
[046] In some embodiments, the reshaped content encoder (108) comprises software, hardware, or a combination of software and hardware, etc., configured to encode the reshaped video content in a reshaped video signal (for example, a dual-layer 8-bit video signal encoded with the reshaped video content, a single-layer 10-bit video signal encoded with the reshaped video content, etc.). Additionally, optionally, or alternatively, in some embodiments, the video encoder (102), or the reshaped content encoder contained therein, outputs metadata that comprise some or all of the operational parameters used in the operations of the video encoder (102) as part of the reshaped video signal to a downstream device (for example, a video decoder 152 of Figure 1B, etc.). The operational parameters in the metadata transmitted to downstream devices include, but are not limited to, any of the following: composition metadata that comprise adaptive reshaping parameters defining reshaping functions or inverses thereof, composition metadata that define LUTs representing reshaping functions or inverses thereof, composition metadata that define approximations of one or more reshaping functions or of inverses of reshaping functions, one or more tone mapping parameters, clipping parameters, exponent values used in power functions for gamma compression, inverse mapping parameters, LUTs, pivot values in piecewise linear (PWL) functions, non-linear quantization (NLQ) parameters, etc. The metadata can be part of the data transmitted in layers containing the encoded video content, or in a bitstream separate from an overall video stream, for example, as part of supplemental enhancement information (SEI) or of other metadata carriers available in the video bitstream. An exemplary sub-bitstream can be a reference processing unit (RPU) stream developed by Dolby Laboratories, Inc.
[047] As used here, the term bit depth refers to the number of bits provided in an encoding space that provides code words available to encode or quantize image data; an example of a low bit depth is 8 bits; an example of a high bit depth is 12 bits or more.
[048] As used here, video content can comprise a sequence of images or frames. As used here, a source image can refer to an image such as an image of a scene captured by a high-end image capture device, a reference-coded image, a PQ-coded image, etc. A source image can comprise code words available in a code word space of a relatively large bit depth.
[049] An image, such as a source image, a reshaped image, a reconstructed image, etc., can be in any color space. For example, a source image can be a 12+ bit image in a YCbCr color space, an RGB color space, an XYZ color space, a YDzDx color space, an IPT color space, etc. In an example, each pixel represented in an image comprises code words for all channels (for example, RGB channels, luma and chroma channels, XYZ channels, YDzDx channels, IPT channels, etc.) defined for a color space (for example, a YCbCr color space, an RGB color space, an XYZ color space, a YDzDx color space, an IPT color space, etc.). Each code word may comprise upsampled or downsampled code words for one or more of the channels in the color space. In an exemplary embodiment, the video encoder (102) is configured to perform a color space transformation on an image from a first color space (for example, an RGB color space, etc.) to a second, different color space (for example, a YCbCr color space, etc.).
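The text does not fix a particular color transform matrix. As one hedged example, a full-range RGB-to-YCbCr conversion using BT.709 luma coefficients (our assumed choice) can be sketched as:

```python
import numpy as np

# BT.709 luma coefficients -- an assumed choice; the text above does not
# specify which RGB-to-YCbCr matrix the video encoder (102) uses.
KR, KG, KB = 0.2126, 0.7152, 0.0722

def rgb_to_ycbcr(rgb):
    """Convert full-range RGB in [0, 1] to Y in [0, 1] and Cb/Cr in [-0.5, 0.5]."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = KR * r + KG * g + KB * b
    cb = (b - y) / (2.0 * (1.0 - KB))   # scaled so pure blue maps to Cb = 0.5
    cr = (r - y) / (2.0 * (1.0 - KR))   # scaled so pure red maps to Cr = 0.5
    return np.stack([y, cb, cr], axis=-1)
```

For neutral (gray) inputs the chroma channels are zero, as expected of a luma/chroma decomposition.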
[050] In an exemplary embodiment, the video encoder (102) is configured to downsample or upsample an image in a first sampling format (for example, in a 4:4:4 sampling format, etc.) to a second, different sampling format (for example, in a 4:2:0 sampling format, etc.).
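One simple way to realize the 4:4:4 to 4:2:0 conversion mentioned above is to average each 2x2 block of a chroma plane. This box filter is only one possible choice (our assumption); the text does not mandate a particular downsampling filter:

```python
import numpy as np

def downsample_chroma_420(chroma):
    """Downsample a full-resolution chroma plane by averaging 2x2 blocks.

    A box filter is only one possible choice; real encoders typically use
    longer filters. Height and width are assumed even.
    """
    h, w = chroma.shape
    # Reshape into (h/2, 2, w/2, 2) blocks and average over each block.
    return chroma.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
```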
[051] Examples of a video encoder that implements signal reshaping and other operations include, but are not limited to, any of the following: one or more single-layer 12-bit codecs, one or more two-layer 8-bit codecs, one or more multilayer codecs, one or more non-backward-compatible reshaping codecs, one or more backward-compatible codecs, one or more codecs implementing a set of settings/requirements/options in AVC, one or more codecs implementing a set of settings/requirements/options in HEVC, H.264/AVC/HEVC, MPEG-2, VP8, VC-1, etc.
3. Decoding adaptively reshaped video content
[052] Figure 1B illustrates an exemplary video decoder 152 that can be used as a downstream device to process an input video signal (or a reshaped video signal) encoded with adaptively reshaped video content from upstream devices (one of which may be, for example, a video encoder 102 of Figure 1A, etc.). The video decoder (152) can be implemented with one or more computing devices. In some embodiments, the video decoder 152 comprises a reshaped content decoder 154, an inverse mapper 156, a display manager 158, etc.
[053] In some embodiments, the reshaped content decoder (154) comprises software, hardware, a combination of hardware and software, etc., configured to receive one or more input video signals (for example, encoded bitstreams, etc.) and to decode the input video signals into reshaped video content. In some embodiments, the reshaped video content is decoded from a single-layer video signal (for example, a single-channel 10-bit video signal, etc.) encoded with the reshaped video content in a single layer. In some embodiments, the reshaped video content is decoded from a multi-layer encoded video signal (for example, a two-layer encoded video signal, etc.) encoded with the reshaped video content in more than one layer (for example, a base layer and one or more enhancement layers, etc.).
[054] In some embodiments, the inverse mapper (156) comprises software, hardware, a combination of hardware and software, etc., configured to perform inverse mapping operations on the reshaped video content to generate a reconstructed version of the source video content used by an upstream device to generate the reshaped video content. One or both of the reconstructed video content and the reshaped content can be used in one or more of backward-compatible (BC) video applications, non-backward-compatible (NBC) video applications, etc.
[055] In some embodiments, a reshaping function was adaptively selected by an upstream device (for example, the video encoder 102 of Figure 1A, etc.) to reshape source code words in one or more images, one or more scenes, etc., represented in the source video content into reshaped code words in one or more corresponding images, one or more corresponding scenes, etc., represented in the reshaped video content. According to the techniques described here, the selection of the reshaping function and/or of the adaptive reshaping parameters used in the reshaping function is done adaptively, based on the actual content of the images, scenes, etc., as represented in the source video content.
[056] Examples of reshaping functions may include, but are not limited to, forward power functions, etc. The adaptive reshaping parameters used in a reshaping function applied to reshape an image on the upstream device can be determined/selected by the upstream device based on whether the image contains large smooth bright areas, large dark black areas, etc., on whether the image is a mid-tone image, etc.
[057] In some embodiments, the reshaped video content received by the video decoder (152) is generated by the upstream device applying adaptive reshaping operations directly to source code words in the source video content, based on an adaptive reshaping function with selected adaptive reshaping parameters.
[058] In some embodiments, the reshaped video content received by the video decoder (152) is generated by the upstream device based on a LUT (for example, a forward lookup LUT, etc.), which may or may not be generated based on a selected adaptive reshaping function.
[059] In some embodiments, the reshaped video content received by the video decoder (152) is generated by the upstream device based on an approximation of a (target) LUT that represents, or is equivalent to, a reshaping function. The approximation may or may not be based on polynomials.
[060] Regardless of how the reshaped video content received by the video decoder (152) is generated by the upstream device, the video decoder (152), or the inverse mapper (156) contained therein, can be configured to obtain adaptive reshaping parameters by decoding the composition metadata that is transmitted as part of the metadata carried in the input video signal received by the video decoder (152).
[061] In some embodiments, based on the decoded adaptive reshaping parameters, the inverse mapper (156) is configured to determine an approximation of a (target) LUT (or a backward lookup LUT) that represents an inverse of a reshaping function (for example, a reshaping function used by the upstream device to perform adaptive reshaping on one or more images, etc.). The video decoder (152), or the inverse mapper (156) contained therein, can be configured to generate a reconstructed version of the source video content (used by the upstream device to generate the reshaped video content received by the video decoder 152) by applying inverse mapping operations, using the target LUT approximation, to the reshaped video content decoded from the reshaped video signal, regardless of whether the upstream device applied the adaptive reshaping operations to the source code words in the source video content based on a reshaping function, or alternatively based on a forward lookup LUT that represents the reshaping function, or alternatively based on an approximation of the forward lookup LUT.
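To make this concrete, a minimal sketch of evaluating a piecewise second-order polynomial approximation of the backward LUT is shown below. This is our own construction for illustration; the actual decoder-side syntax and fixed-point arithmetic are not given in this text:

```python
import numpy as np

def apply_piecewise_inverse(s, pivots, coeffs):
    """Map reshaped code words s back toward source code words.

    pivots: ascending segment boundaries, len(pivots) == len(coeffs) + 1.
    coeffs: one (c0, c1, c2) triple per segment, v ~= c0 + c1*s + c2*s^2.
    """
    s = np.asarray(s, dtype=np.float64)
    # Find which segment each reshaped code word falls into.
    seg = np.clip(np.searchsorted(pivots, s, side="right") - 1, 0, len(coeffs) - 1)
    c = np.asarray(coeffs, dtype=np.float64)[seg]
    return c[:, 0] + c[:, 1] * s + c[:, 2] * s * s
```

The vectorized segment lookup via `searchsorted` mirrors how a decoder can apply the approximation to a whole plane of code words at once.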
[062] In some embodiments, the display manager (158) comprises software, hardware, a combination of hardware and software, etc., configured to perform video processing operations, such as display management operations, etc., on the reconstructed version of the source video content. Display management operations can include, but are not limited to, any of the following: tone mapping operations, clipping operations, color gamut adaptation operations, etc. Some or all of these operations may be device-specific. Through these operations, the images represented in the reconstructed version of the source video content can be presented on a display device, which can be part of the same device that includes the video decoder (152), can operate in conjunction with the video decoder (152), etc.
[063] In an exemplary embodiment, the video decoder (152) is configured to upsample or downsample an image in a first sampling format (for example, in a 4:2:0 sampling format, etc.) to a second, different sampling format (for example, in a 4:4:4 sampling format, etc.).
[064] Examples of a video decoder that implements inverse mapping, inverse signal reshaping and other operations include, but are not limited to, any of the following: one or more single-layer 12-bit codecs, one or more two-layer 8-bit codecs, one or more multilayer codecs, one or more non-backward-compatible reshaping codecs, one or more backward-compatible codecs, one or more codecs implementing a set of settings/requirements/options in AVC, one or more codecs implementing a set of settings/requirements/options in HEVC, H.264/AVC/HEVC, MPEG-2, VP8, VC-1, etc.
4. Power functions for adaptive reshaping
[065] In some embodiments, adaptive reshaping can be performed effectively with power functions, for example, on video signals that support extended dynamic ranges (EDR) (for example, up to 6,000 nits, 12,000 nits, 20,000+ nits, etc.). Power functions can be used to compress a source video signal of a relatively high bit depth, such as a perceptually quantized (PQ) 12+ bit video signal, etc., into an adaptively reshaped video signal of a relatively low bit depth, such as an adaptively reshaped 8-bit or 10-bit video signal, etc. Optimized adaptive reshaping parameters can be selected based, at least in part, on the content of the source video signal, to reduce or prevent visual artifacts in the adaptively reshaped video signal. The selection of these optimized adaptive reshaping parameters can be made automatically by an upstream device for a current image, a current scene, etc., as represented in the source video signal, while the current image, the current scene, etc., is being processed and adaptively reshaped/compressed by the upstream device into an image, a scene, etc., represented in the adaptively reshaped video signal. Some examples of adaptive reshaping with power functions are described in patent application PCT/US2014/031716, filed on March 25, 2014, owned by the assignee of the present application, the contents of which are incorporated herein by reference for all purposes as if fully set forth here.
[066] In an exemplary embodiment, adaptive reshaping is performed by an upstream device, such as a video encoder, etc., with a forward power function, as follows:

s_i = Round( (C_H^Y − C_L^Y) · ( (v_i^Y − v_L^Y) / (v_H^Y − v_L^Y) )^α + C_L^Y )    (1)

where α represents an exponent value; v_i^Y represents source code words (for example, luminance source code words, etc.) decoded from a source video signal that is being reshaped by the video encoder; s_i represents adaptively reshaped code words (for example, adapted/mapped luminance code words) adapted/mapped from v_i^Y with the forward power function; Round(...) represents a rounding function; C_L^Y and C_H^Y are minimum and maximum values, respectively, of the adaptively reshaped code words (for example, adapted/mapped luminance code words, etc.); v_L^Y and v_H^Y are minimum and maximum values, respectively, of the source code words (for example, luminance source code words, etc.). In addition, optionally or alternatively, in some embodiments, a clipping function can be used to ensure that any out-of-range code word (for example, outside the range [C_L^Y, C_H^Y], etc.) after lossy compression can still be inverse-mapped by a downstream receiver device, such as a video decoder, etc., to the closest valid reconstructed source code word.
[067] The inverse mapping can be performed by a downstream receiver device, such as a video decoder, etc., with an inverse power function as follows:

v̂_i = ( (ŝ_i − C_L^Y) / (C_H^Y − C_L^Y) )^(1/α) · (v_H^Y − v_L^Y) + v_L^Y    (2)

where v̂_i represents reconstructed source code words (for example, reconstructed luminance source code words, etc.) inverse-mapped from reshaped code words ŝ_i decoded from an adaptively reshaped video signal, which was reshaped by an upstream device, such as the video encoder in the present example.
[0068] In some embodiments, in the forward and inverse power functions, C_L^Y and C_H^Y can be set as follows:

C_L^Y = 0    (3)

C_H^Y = effective_codewords − 1    (4)

where the symbol effective_codewords represents the number of code words available to represent the adaptively reshaped code words (for example, 511 in an 8-bit two-layer video signal, 1023 in a single-layer 10-bit video signal, etc.).
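Under the settings of expressions (3) and (4), the forward and inverse power mappings of expressions (1) and (2) can be sketched as follows. This is a toy 12-bit-to-10-bit example with an assumed exponent, not a normative implementation:

```python
import numpy as np

def forward_reshape(v, alpha, v_l, v_h, c_l, c_h):
    """Expression (1): normalize, apply the power, rescale, round, and clip."""
    t = (v.astype(np.float64) - v_l) / (v_h - v_l)
    s = np.round((c_h - c_l) * np.power(t, alpha) + c_l)
    return np.clip(s, c_l, c_h).astype(np.int64)  # clipping keeps code words in range

def inverse_reshape(s, alpha, v_l, v_h, c_l, c_h):
    """Expression (2): the inverse power mapping applied by the decoder."""
    t = (s.astype(np.float64) - c_l) / (c_h - c_l)
    return np.power(t, 1.0 / alpha) * (v_h - v_l) + v_l

# Expressions (3) and (4): C_L = 0, C_H = effective_codewords - 1.
EFFECTIVE_CODEWORDS = 1024        # single-layer 10-bit signal
C_L, C_H = 0, EFFECTIVE_CODEWORDS - 1
```

With, say, alpha = 0.5, a full 12-bit ramp survives the 10-bit round trip with only a small, bounded quantization error.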
[069] A reshaping function, or an inverse thereof, such as a forward power function, an inverse power function, etc., can be represented in the form of a LUT, such as a one-dimensional LUT (1D-LUT), etc. The techniques described here can use a LUT that represents, or that is generated based on, a reshaping function or an inverse thereof as a target LUT for approximation with a limited number of second-order polynomials. In some embodiments, the coefficients may be fixed-point numbers of limited precision, for example, to satisfy device-specific restrictions, module-specific restrictions (for example, SoC-related restrictions, etc.), etc.
[070] Taking as an example a power function like the one shown in expression (1) or (2): an exponent (for example, α in the forward power function of expression (1), 1/α in the inverse power function of expression (2), etc.) can be greater than one (1), making the power function convex, or it can be less than one (1), making the power function concave. These different exponent values create difficulties in approximating a target LUT generated based on the power function.
[071] In several embodiments, one or more algorithms can be used, individually or in combination, to approximate the target LUT. In one example, a first algorithm, designated a forward search algorithm, can be used to approximate the power function from small to large code word values (or left to right) along a horizontal coordinate axis that represents input code word values (where a vertical coordinate axis represents the mapped code word values). For use in the present invention, the input code word values in a LUT, such as the target LUT, etc., may refer to the keys in LUT key-value pairs, whereas the mapped code word values in the LUTs may refer to the values in the LUT key-value pairs. In another example, a second algorithm, designated an inverted search algorithm, can be used to approximate the power function from large to small code word values (or right to left) along the horizontal coordinate axis. In a further example, both the first and the second algorithms can be used, and the result of whichever algorithm performs better (for example, generating smaller approximation errors, faster convergence, etc.) can be used to approximate the power function.
[072] It should be noted that an adaptive reshaping function may or may not be a forward power function. In addition, optionally or alternatively, an inverse of a reshaping function may or may not be an inverse power function. In some embodiments, an inverse of a reshaping function, as described here, is represented by an optimized backward lookup LUT (designated BL()) that can be derived or deduced from any arbitrary reshaping function. The reshaping function can be used by an upstream device, such as a video encoder, etc., to perform adaptive reshaping. In some embodiments, the adaptive reshaping function can be represented by a forward lookup LUT (designated FL()). The optimized backward lookup LUT (or the optimal BL()) can be used by a downstream device, such as a video decoder, etc., to perform inverse mapping.
[073] For each reshaped code word value, s_c, used in the reshaped video content, all pixels (for example, in an image, in images of the same scene, etc.) that have the same reshaped code word value s_c in the reshaped video content are grouped together. Based on these pixels in the reshaped video content, a set of corresponding source code word values in the source video content that were reshaped or mapped to s_c is then determined or identified, as follows:

ω(s_c) = { i | FL(v_i) = s_c }    (5)

[074] For each code word value s_c, if its corresponding set of source code word values is not empty, then the mean of all source code word values collected in the set is computed. The mean of all collected source code word values corresponding to each code word value s_c can be used to construct the optimal BL(s_c), as follows:
BL(s_c) = ( Σ_{i ∈ ω(s_c)} v_i ) / | ω(s_c) |    (6)

where | ω(s_c) | represents the number of source code word values collected in the set of expression (5) above.
[075] In some embodiments, the optimal BL(s_c) in expression (6) can be used as a target LUT to be approximated (for example, by polynomials, etc.).
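The construction in expressions (5) and (6) can be sketched directly. This minimal version assumes integer reshaped code words, so `np.bincount` performs the per-code-word grouping and averaging in one pass:

```python
import numpy as np

def build_optimal_blut(v, s, num_codewords):
    """Expressions (5)-(6): average the source values v that were reshaped
    to each code word s_c; returns the optimal BL and an occupancy mask."""
    v = np.asarray(v, dtype=np.float64)
    s = np.asarray(s)
    sums = np.bincount(s, weights=v, minlength=num_codewords)
    counts = np.bincount(s, minlength=num_codewords)
    blut = np.zeros(num_codewords)
    occupied = counts > 0          # sets omega(s_c) that are non-empty
    blut[occupied] = sums[occupied] / counts[occupied]
    return blut, occupied
```

Empty sets (code words never produced by FL()) are left at zero here; a real implementation would fill them in, for example by interpolating between occupied neighbors.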
5. Approximating functions related to adaptive reshaping
[076] Figure 2 illustrates an exemplary method of approximating a reshaping function. One or more computing devices, or one or more modules at least partially implemented in hardware on a computing device, etc., can perform this method. For example, a LUT approximation module in a video processing device, such as an adaptive reshaping module in a video encoder, an inverse mapping module in a video decoder, etc., can perform some or all of the actions in the method of Figure 2.
[077] In block 202, the LUT approximation module initiates the approximation of a target LUT by defining an initial error threshold t.
[078] In block 204, the LUT approximation module determines whether a continuity condition must be met. If so, then the polynomials used to approximate the target LUT must satisfy a constraint that the piecewise curve formed by the polynomials is 0th-order continuous; any two neighboring curve segments, as represented by two neighboring polynomials of the piecewise curve, join each other. On the other hand, if the continuity condition is not to be met, such a restriction need not be satisfied by the polynomials (neighboring polynomials may or may not join one another). Whether to enable or disable this continuity constraint may depend on the content. In one example, the LUT approximation module can determine that an image or a scene contains smooth image content. In response to this determination, the continuity constraint can be enforced when approximating the target LUT for reshaping or inverse mapping operations on such an image or scene. This can prevent artifacts such as banding from occurring in the relatively smooth image content. In another example, the LUT approximation module can determine that an image or scene contains image content with relatively high variation (for example, in terms of differences and variations in luminance values or chroma values, etc.). In response to this determination, the continuity constraint need not be enforced when approximating the target LUT for reshaping or inverse mapping operations on such an image or scene, as artifacts such as color banding are less likely to occur in image content of relatively high variation.
[079] In some embodiments, the LUT approximation module can select a set of stop rules, from one or more different sets of stop rules, to be applied in the approximation operation, based on whether the continuity condition or constraint is to be enforced. In response to a determination that the continuity condition is not to be enforced, in block 206, the LUT approximation module can be configured to select a first set of stop rules. On the other hand, in response to a determination that the continuity condition is to be enforced, in block 208, the LUT approximation module can be configured to select a second set of stop rules.
[080] A stop rule can refer to a rule (or a part of a composite rule) used to determine, at least in part, whether or not to end the approximation of a segment, to stop or end a calculation or operation, to change to a different calculation or operation, etc., when the target LUT is approximated. In some embodiments, the stop rules may contain not only a threshold detector, but also a rising-edge detector, a minimum/maximum segment length detector, etc. Stop rules (for example, a specific combination of stop rules, etc.) can be used to produce better fit accuracy when ending the approximation of a segment than simply using a threshold detector based on an overall error threshold.
[081] In various embodiments, different stop rules can be adopted based on the image types, the types of reshaping function, etc. For example, for reshaping functions represented by curves that are difficult to approximate with polynomials, a relatively relaxed stop rule can be adopted. For relatively smooth images, a relatively strict stop rule can be adopted. In some embodiments, a stop rule can be equivalent to a reduction in the degrees of freedom in the approximation operations. The more degrees of freedom (for example, one degree of freedom, two degrees of freedom, etc.) a stop rule removes, the greater the distortion that an approximation of a target LUT, generated in part based on the stop rule, can generate. The approximation error can be minimized if no stop rules, or relatively relaxed stop rules, are used. However, the approximation may then comprise curve segments that do not join at their ends, which may or may not be appropriate for specific types of images (for example, relatively smooth images, relatively non-smooth images, etc.).
[082] In an exemplary implementation, Rule No. 1 is defined as outlined below:

(prev_error_condition && curr_error_condition) || max_custom_length_condition    (7)

where x && y denotes the boolean AND of x and y; x || y denotes the boolean OR of x and y; prev_error_condition represents a predicate indicating whether the last fit error is less than an applicable error threshold (for example, a default error threshold t, an adjusted error threshold 0.75t, a further adjusted threshold, etc.); curr_error_condition represents a predicate indicating whether the current fit error is less than the applicable error threshold; and max_custom_length_condition represents a predicate indicating whether a segment has reached a predefined maximum length.
[083] In an exemplary implementation, Rule No. 2 is defined as follows:

(curr_error_condition && min_custom_length_condition) || max_custom_length_condition    (8)

where min_custom_length_condition is a predicate indicating whether a segment has reached a predefined minimum length.
[084] In an exemplary implementation, Rule No. 3 is defined as outlined below:

curr_error_condition && min_custom_length_condition    (9)

[085] In some embodiments, the forward search and the inverted search can use the same set of stop rules. In some other embodiments, the forward search and the inverted search may use different sets of stop rules. In some embodiments, as shown in block 206 of Figure 2, when the continuity condition is NOT to be met, Rule No. 1 is selected for both the forward search and the inverted search. In some embodiments, as shown in block 208 of Figure 2, when the continuity condition is to be met, Rule No. 1 is selected for the forward search, while Rule No. 2 is selected for the inverted search.
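Expressed as code (our own sketch; the polarity of the predicates follows the text above), the three stop rules of expressions (7) through (9) are simple boolean combinations:

```python
def stop_rule_1(prev_err, curr_err, seg_len, threshold, max_len):
    """Expression (7): both recent fit errors below the threshold, or the
    segment has reached its predefined maximum length."""
    return (prev_err < threshold and curr_err < threshold) or seg_len >= max_len

def stop_rule_2(curr_err, seg_len, threshold, min_len, max_len):
    """Expression (8): current error below the threshold once the segment
    has at least the minimum length, or the maximum length is reached."""
    return (curr_err < threshold and seg_len >= min_len) or seg_len >= max_len

def stop_rule_3(curr_err, seg_len, threshold, min_len):
    """Expression (9): current error below the threshold and minimum length reached."""
    return curr_err < threshold and seg_len >= min_len
```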
[086] In block 210, the LUT approximation module performs the approximation of the target LUT with a forward search algorithm. The results of the approximation of the target LUT with the forward search algorithm can be saved in memory in block 212.
[087] In block 214, the LUT approximation module performs the approximation of the target LUT with an inverted search algorithm.
[088] In some embodiments, the approximation of a target LUT in the forward search algorithm or in the inverted search algorithm includes the following steps. After an initial error threshold t is set for one or more algorithms, second-order polynomials are fitted to segments of a reshaping function, or of an inverse thereof, as represented by the target LUT. The fitting of the second-order polynomials to the segments can be carried out segment by segment, for example, from left to right in the forward search algorithm, or from right to left in the inverted search algorithm. Each segment can be determined or chosen in such a way that the fit error between that segment and a corresponding approximating polynomial does not exceed the error threshold t. If the number of segments is equal to a maximum number determined for the number of second-order polynomials, then the curve fitting ends successfully. On the other hand, if the number of segments is less than the maximum number determined for the number of second-order polynomials, then the error threshold t is reduced (for example, t = 0.75t, etc.); the steps mentioned above are repeated with the reduced error threshold, until the number of segments is equal to the maximum number determined for the number of second-order polynomials.
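A minimal sketch of the procedure in [088] follows. It is our own simplification: greedy left-to-right segment growth under a single maximum-absolute-error test, no continuity constraint, and the 0.75t threshold reduction of the outer loop:

```python
import numpy as np

def _fit_quadratic(x, y):
    """Least-squares polynomial of order min(2, points - 1);
    returns coefficients and the maximum absolute fit error."""
    deg = min(2, len(x) - 1)
    c = np.polyfit(x, y, deg)
    err = float(np.max(np.abs(np.polyval(c, x) - y)))
    return c, err

def forward_search(target, t):
    """Greedy left-to-right pass: grow each segment while the fit error stays <= t."""
    n = len(target)
    x = np.arange(n, dtype=np.float64)
    pivots, coeffs = [0], []
    start = 0
    while start < n:
        end = min(start + 3, n)              # 3 points fit a quadratic exactly
        c, _ = _fit_quadratic(x[start:end], target[start:end])
        while end < n:
            c_try, err = _fit_quadratic(x[start:end + 1], target[start:end + 1])
            if err > t:
                break                        # extending would violate the threshold
            c, end = c_try, end + 1
        coeffs.append(c)
        pivots.append(end)
        start = end
    return pivots, coeffs

def approximate_target_lut(target, t, max_segments, max_iter=20):
    """Outer loop: reduce t (t = 0.75 t) until the segment budget is fully used."""
    pivots, coeffs = forward_search(target, t)
    for _ in range(max_iter):
        if len(coeffs) >= max_segments:
            break
        t *= 0.75
        pivots, coeffs = forward_search(target, t)
    return pivots, coeffs
```

An inverted search would run the same growth loop from right to left; both passes can then be compared as in blocks 216 through 220.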
[089] In block 216, the LUT approximation module determines whether the approximation of the target LUT with the inverted search algorithm generates a larger error than the approximation of the target LUT with the forward search algorithm.
[090] If the approximation of the target LUT with the inverted search algorithm generates a larger error than the approximation with the forward search algorithm, then, in block 218, the LUT approximation module chooses the approximation of the target LUT with the forward search algorithm as the (for example, final, etc.) approximation of the target LUT.
[091] If the approximation of the target LUT with the inverted search algorithm does not generate a larger error than the approximation with the forward search algorithm, then, in block 220, the LUT approximation module chooses the approximation of the target LUT with the inverted search algorithm as the (for example, final, etc.) approximation of the target LUT.
[092] In block 222, the LUT approximation module terminates the operations for approximating the target LUT.
[093] It has been described that an approximation of a target LUT, which represents or is generated based on an arbitrary reshaping function, can be performed with a forward search followed by an inverted search. This is for illustration purposes only. In various embodiments, an approximation of a target LUT, which represents or is generated based on an arbitrary reshaping function, can be performed with a single search, such as a forward search but not an inverted search, or an inverted search but not a forward search. In addition, optionally and alternatively, an approximation of a target LUT, which represents or is generated based on an arbitrary reshaping function, can be performed with an inverted search followed by a forward search. Therefore, these and other variations of the approximation of a target LUT, which represents or is generated based on an arbitrary reshaping function, can be used according to the techniques described here.
[094] Figure 3A illustrates an exemplary processing flow for approximating a target LUT. One or more computing devices, or one or more modules at least partially implemented in hardware on a computing device, etc., can perform this method. For example, a LUT approximation module in a video processing device, such as an adaptive reshaping module in a video encoder, an inverse mapping module in a video decoder, etc., can perform some or all of the processing flow of Figure 3A.
[095] In block 302, the processing flow begins with the initialization of one or more pivot-related variables, processing flags, and convergence-related variables, such as a maximum number (for example, 20, 30, 40, 50, 60, etc.) of iterations in an outer loop, an upper bound max_error on the search error (for example, initialized to the maximum error threshold t, etc.), a flag (designated found_CC) to indicate whether a valid approximation is found in an iteration (for example, initialized to false, etc.), etc. The maximum number of iterations, the upper bound on the search error, etc., are convergence constraints that aim to minimize distortions, and they can assume different values (for example, system-configurable, user-overridable, tunable based on statistics collected with a set of training images, etc.) in various embodiments. In some embodiments, the greater the maximum number of iterations, the harder the processing flow will try to minimize the distortion when approximating the reshaping function. In some embodiments, the upper bound on the search error provides a limit on the distortion at which the approximation of the reshaping function is considered converged (for example, with iterations at or below the maximum number of iterations, etc.).
[096] In block 304, the processing flow enters the outer loop with a convergence flag convergence_iter that was initialized to false (0) in block 302. More specifically, in block 304, it is determined whether the convergence flag convergence_iter is set to true (1).
[097] In response to the convergence flag convergence_iter being set to true, in block 306, the processing flow returns the pivot points and the best coefficient sets for the polynomials that approximate the target LUT. The processing flow ends at block 308.
[098] In response to the convergence flag convergence_iter being set to false, the processing flow resets one or more inner-loop parameters, such as an inner-loop flag converge to false, the flag found_CC to false, a variable num_pivot (for example, indicating the current number of pivots, etc.) to one (1), etc., and enters an inner loop with the inner-loop flag converge reset to false (0). More specifically, at block 310, the processing flow determines whether the inner-loop flag converge is set to true (1).
[099] In response to the inner-loop flag converge being set to false, the processing flow initializes a processing flag found_one_seg to false (0), initializes an inner-loop iteration counter iter_cnt to one (1), and proceeds to perform a second processing flow to add a new segment that is approximated by a polynomial, proceeding to block 332 of the second processing flow illustrated in Figure 3B. The second processing flow is repeated until the inner-loop flag converge is set to true in block 310.
[0100] In response to the inner-loop flag converge being set to true, at block 314, the processing flow determines whether the number of iterations in the outer loop, as indicated by an outer-loop iteration counter Out_iter_cnt, exceeds a maximum number of outer-loop iterations, as indicated by a configured (for example, constant, etc.) value MAX_OUT_CNT.
[0101] In response to the fact that the number of iterations in the external circuit exceeds the maximum number of iterations in the external circuit, in block 316, the convergence_iter convergence flag is set to true (1), and the processing flow proceeds to block 304.
[0102] In response to the fact that the number of iterations of the external circuit does not exceed the maximum number of iterations of the external circuit, in block 318, the processing flow determines whether the current number of pivots, as indicated by the variable num_pivot, exceeds a maximum number of pivots (for example, a maximum of nine pivots if the maximum number of polynomials is eight, a maximum of ten pivots if the maximum number of polynomials is nine, etc.), as indicated by a configured value (for example, a constant, etc.) MAX_NUM_PIVOT.
[0103] In response to the fact that the number of pivots does not exceed the maximum number of pivots, in block 320, the upper limit of the search error, designated max_error, is reduced (for example, max_error = 0.75 max_error, etc.), and the processing flow goes to block 304.
[0104] In response to the fact that the number of pivots exceeds the maximum number of pivots, in block 316, the convergence_iter convergence flag is set to true (1), and the processing flow goes to block 304.
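The external circuit of Figure 3A can be sketched as follows. This is a minimal illustrative sketch, not a definitive implementation: the values of MAX_OUT_CNT and MAX_NUM_PIVOT, the selection criterion for the best result, and the `fit_segments` callable (a caller-supplied stand-in for the internal circuit of Figure 3B) are assumptions.

```python
MAX_OUT_CNT = 10       # maximum number of external circuit iterations (illustrative)
MAX_NUM_PIVOT = 9      # e.g., nine pivots for at most eight polynomials (illustrative)

def approximate_lut(target_lut, fit_segments, max_error):
    """Repeatedly run the internal circuit (here, fit_segments), tightening
    the error bound on each pass, and keep the best pivots/coefficients."""
    best_pivots, best_coeffs = None, None
    out_iter_cnt = 0
    converge_iter = False                       # block 302
    while not converge_iter:                    # block 304
        pivots, coeffs = fit_segments(target_lut, max_error)
        if len(pivots) <= MAX_NUM_PIVOT:        # keep only results within the pivot budget
            best_pivots, best_coeffs = pivots, coeffs
        out_iter_cnt += 1
        if out_iter_cnt > MAX_OUT_CNT:          # block 314 -> block 316
            converge_iter = True
        elif len(pivots) > MAX_NUM_PIVOT:       # block 318 -> block 316
            converge_iter = True
        else:
            max_error *= 0.75                   # block 320: tighten the error bound
    return best_pivots, best_coeffs             # block 306: best results found
```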
[0105] Figure 3B illustrates a second example processing flow that can be invoked from block 310 of Figure 3A, in response to the internal circuit flag converge being set to false. One or more computing devices, or one or more modules at least partially implemented in hardware on a computing device, etc., can perform this method. For example, a LUT approximation module in a video processing device, such as an adaptive remodeling module in a video encoder, a reverse mapping module in a video decoder, etc., can perform part of or all of the processing flow of Figure 3B.
[0106] In block 332, the second processing flow determines whether the processing flag found_one_seg is set to true (1). In response to the fact that the found_one_seg processing flag is set to false, the second processing flow sets the lower limit and the upper limit for the candidate segment, and then proceeds to block 346.
[0107] In some embodiments, adjusting the lower limit and the upper limit for a candidate segment includes adjusting the horizontal distance between the lower limit and the upper limit, so that the length of the candidate segment equals the current value of the internal circuit iteration count iter_cnt. As a result, in situations with multiple internal circuit iterations, as long as the found_one_seg flag remains false (0), the candidate segment increases in length as the internal circuit iteration count iter_cnt is incremented from one internal iteration to the next.
[0108] On the other hand, in response to the fact that the found_one_seg processing flag is set to true, the second processing flow goes to block 334. However, before moving from block 332 to block 334, the second processing flow first determines whether the number of pivots, as indicated by the variable num_pivot, exceeds the maximum number of pivots represented by the configured value MAX_NUM_PIVOT. In response to the fact that the number of pivots exceeds the maximum number of pivots, the found_CC flag is set to false (0), and the internal circuit flag converge is set to true (1).
[0109] In block 334, the second processing flow proceeds to determine whether the found_CC flag is set to true (1). In response to the fact that the found_CC flag is set to false (0), the second processing flow returns to block 310 in the processing flow of Figure 3A. On the other hand, in response to the fact that the found_CC flag is set to true (1), in block 336, the second processing flow determines whether a search direction value indicates a direct search or an inverted search. In response to the fact that the search direction value indicates an inverted search, in block 338, the second processing flow reorders or inverts the records in a generated list of pivot points, the records in a list of coefficients, etc., and then proceeds to block 340. In response to the fact that the search direction value indicates a direct search, the second processing flow proceeds directly to block 340, without reordering or inverting the records in the generated list of pivot points, the records in the list of coefficients, etc.
[0110] In block 340, the second processing flow reconstructs a generated LUT, based on the polynomials defined by the records contained in the generated list of pivot points, the records contained in the list of coefficients, etc., and calculates a maximum error max_diff between the target LUT and the generated LUT. In block 342, the second processing flow determines whether the maximum error max_diff is not greater than a previous best maximum error prev_best_max_error.
[0111] In response to the fact that the max_diff maximum error is no greater than a previous best error prev_best_max_error, the records contained in the generated list of pivot points, the records contained in the generated list of coefficients, etc., are saved as the best current pivot points, the best current coefficients, etc., and the second processing flow goes to block 310 in the processing flow of Figure 3A.
[0112] In response to the fact that the maximum error max_diff is greater than the previous best maximum error prev_best_max_error, the records contained in the generated list of pivot points, the records contained in the generated list of coefficients, etc., are not saved as the best current pivot points, the best current coefficients, etc., and the second processing flow goes to block 310 in the processing flow of Figure 3A.
[0113] In block 346, the second processing flow determines whether a continuity condition flag continuity_condition is set to false (0) or whether the variable num_pivot is one (1). When the continuity_condition flag is set to false, the continuity condition or continuity constraint is not met, as discussed earlier. When the continuity_condition flag is set to true (1), the continuity condition or continuity constraint is met, as discussed earlier.
[0114] In response to the fact that the continuity condition is met and the variable num_pivot is not one, the second processing flow goes to block 348. In some modalities, the variable num_pivot was initialized to one (1) before the processing flow of Figure 3A entered the internal circuit represented by block 310 starting from block 304, and then, inside the internal circuit, is incremented by one each time a new (for example, valid, etc.) segment approximated by a polynomial is determined or selected. Thus, when the variable num_pivot is not one, the second processing flow in block 346 is dealing with an iteration in which at least one segment, approximated by at least one polynomial, has already been determined or selected.
[0115] In some modalities, a LUT, such as a target LUT, a non-target LUT, etc., which maps input code word values (for example, as keys in the LUT's key-value pairs, etc.) to mapped code word values (for example, as values in the LUT's key-value pairs, etc.), can be represented in a coordinate system (for example, a Cartesian coordinate system, etc.) in which the horizontal axis represents the input code word values and the vertical axis represents the mapped code word values. The last polynomial used to approximate the last segment of the target LUT can be used to calculate a mapped code word value for the input code word value in the next record of the target LUT (for example, the data point immediately following the last segment in the coordinate system, etc.). A combination of the input code word value in the next record and its calculated mapped code word value, corresponding to the last polynomial, can be used as a reference point (pivot) for the candidate segment, for the purpose of fulfilling the continuity condition or continuity constraint.
[0116] In block 348, the second processing flow centers a current segment (for example, the candidate segment, etc.) with respect to the reference point (pivot) that is generated, at least in part, from the last polynomial approximating the last segment. The centering of the candidate segment in block 348 can be accomplished by transforming (for example, converting, etc.) the coordinate system into a new coordinate system in which one or both of the new horizontal and vertical coordinate values of the reference point are zero. In block 350, the second processing flow determines whether the number of data points in the candidate segment is not less than three (3). In response to the fact that the number of data points in the candidate segment is not less than three (3), in block 352, the second processing flow generates a second-order polynomial to approximate the candidate segment, based on the data points in the candidate segment of the target LUT. In response to the fact that the number of data points in the candidate segment is less than three (3), in block 354, the second processing flow generates a first-order polynomial to approximate the candidate segment, based on the data points in the candidate segment of the target LUT. In block 356, the second processing flow analytically derives the polynomial coefficients, in the current coordinate system (for example, the coordinate system before it was transformed into the new coordinate system, etc.), of the first-order polynomial or the second-order polynomial that approximates the candidate segment (for example, as derived in the new coordinate system, etc.), by inversely transforming from the new coordinate system back to the original coordinate system.
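The centering and inverse transformation of blocks 348 to 356 can be sketched as follows, assuming a shift of the coordinate system to the reference pivot (s0, v0) and a least-squares fit with the constant term forced to zero (which makes the fitted curve pass through the pivot). The function name and the use of NumPy are illustrative.

```python
import numpy as np

def fit_continuous_segment(s, v, s0, v0):
    """Center the candidate segment on the reference pivot (s0, v0), fit a
    polynomial with no constant term in the shifted coordinates (so that
    continuity at the pivot is enforced), then transform the coefficients
    back to the original coordinate system."""
    x = s - s0                               # block 348: shift so pivot is at origin
    y = v - v0
    if len(x) >= 3:                          # block 352: second-order fit
        A = np.stack([x, x * x], axis=1)     # y = a*x + b*x^2, constant fixed at 0
    else:                                    # block 354: first-order fit
        A = x[:, None]                       # y = a*x
    c, *_ = np.linalg.lstsq(A, y, rcond=None)
    a = c[0]
    b = c[1] if len(c) > 1 else 0.0
    # block 356: expand v = v0 + a*(s-s0) + b*(s-s0)^2 into m0 + m1*s + m2*s^2
    m0 = v0 - a * s0 + b * s0 * s0
    m1 = a - 2.0 * b * s0
    m2 = b
    return m0, m1, m2
```

By construction, evaluating the returned polynomial at s0 gives exactly v0, which is the continuity constraint discussed above.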
[0117] In block 346, in response to the fact that the continuity condition is not met in approximating the target LUT, or that the variable num_pivot is one, the second processing flow goes to block 358. The second processing flow performs operations in blocks 358, 360 and 362 that are similar to the operations in blocks 350, 352 and 354.
[0118] As shown in Figure 3B, the second processing flow moves to block 364 from one of blocks 356, 360 or 362. In block 364, the second processing flow determines whether the upper limit of the current segment, designated as the next pivot point, is the last possible (data) point or last record in the target LUT. In response to the fact that the upper limit of the current segment is the last possible (data) point or record in the target LUT, the second processing flow ends the current segment, sets the found_one_seg flag to true (1), the flag converge to true (1), the found_CC flag to false (0), etc., and proceeds to block 332.
[0119] In response to the fact that the upper limit of the current segment is not the last possible (data) point or record in the target LUT, the second processing flow determines whether the current segment (or the candidate segment) satisfies the stop rule (for example, zero, one or more of Rule 1, Rule 2, Rule 3, etc.) defined for the approximation operations. In response to the fact that the current segment (or the candidate segment) satisfies the stop rule, in block 372, the second processing flow increments the variable num_pivot by one, sets the found_one_seg flag to true (1), the found_CC flag to false (0), etc., and proceeds to block 332. On the other hand, in response to the fact that the current segment (or the candidate segment) does not satisfy the stop rule, in block 374, the second processing flow sets the found_one_seg flag to false (0) and the found_CC flag to false (0), and proceeds to block 332.
[0120] Before the second processing flow goes to block 332 from any of blocks 368, 372 and 374, the internal circuit iteration count iter_cnt is incremented by one. When the found_one_seg flag remains false, as in the case of block 374, the candidate segment length for the next iteration will increase, as the internal circuit iteration count iter_cnt is incremented at the beginning of the next iteration. The longer candidate segment will then be approximated, in the next iteration, by a polynomial in blocks 352, 354, 360, or 362, as discussed above.
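A greatly simplified sketch of the internal circuit of Figure 3B is given below. It grows each candidate segment one LUT entry at a time (mimicking iter_cnt), chooses a first-order or second-order polynomial as in blocks 350 to 354, and uses a single error bound as a stand-in for the stop rules; the continuity constraint and the search direction are omitted, so this is an assumption-laden illustration rather than the disclosed method.

```python
import numpy as np

def fit_segments(target_lut, max_error):
    """Greedy segmentation of a LUT: grow each candidate segment until the
    next growth step would exceed max_error, then close the segment and
    record a pivot at its upper limit."""
    x = np.arange(len(target_lut), dtype=float)
    y = np.asarray(target_lut, dtype=float)
    pivots, coeffs = [0], []
    lo = 0
    while lo < len(x) - 1:
        hi = lo + 1
        best_c, best_hi = None, hi
        while hi < len(x):
            order = 2 if hi - lo + 1 >= 3 else 1      # blocks 350/352/354
            c = np.polyfit(x[lo:hi + 1], y[lo:hi + 1], order)
            err = np.abs(np.polyval(c, x[lo:hi + 1]) - y[lo:hi + 1]).max()
            if err > max_error and best_c is not None:
                break                                  # keep the last valid fit
            best_c, best_hi = c, hi                    # accept (the minimal segment
            hi += 1                                    # is accepted even if poor,
        pivots.append(best_hi)                         # to guarantee progress)
        coeffs.append(best_c)
        lo = best_hi
    return pivots, coeffs
```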
[0121] Adaptive remodeling according to the techniques described here can be performed with one or more of a variety of remodeling functions or inverse functions, LUTs representing analytical or non-analytical functions, etc. In the case of power functions, the techniques can be used to specifically select adaptive remodeling parameters, such as exponent values for the power functions, etc., to improve perceptual quality.
[0122] In some embodiments, video content such as source video content, intermediate video content, output video content, etc., can be encoded in a non-linear color space, such as a perceptually quantized (PQ) color space, etc. The PQ color space can comprise a set of PQ code words available to encode video content. The different PQ code words in the PQ color space may not scale linearly with luminance values, but may correspond to variable quantization steps in luminance values. For example, the PQ color space can allocate more code words in regions with dark luminance values and fewer code words in regions with bright luminance values. Some examples of PQ color spaces, transformations, mappings, transfer functions, etc., are described in SMPTE ST 2084:2014, High Dynamic Range Electro-Optical Transfer Function of Mastering Reference Displays, which is fully incorporated by reference.
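The PQ transfer function cited above can be sketched as follows; the constants are those published in SMPTE ST 2084, while the function names are illustrative. Note how the curve allocates finer quantization steps (more code values per nit) to dark luminance regions.

```python
# SMPTE ST 2084 (PQ) constants
M1 = 2610.0 / 16384.0            # 0.1593017578125
M2 = 2523.0 / 4096.0 * 128.0     # 78.84375
C1 = 3424.0 / 4096.0             # 0.8359375
C2 = 2413.0 / 4096.0 * 32.0      # 18.8515625
C3 = 2392.0 / 4096.0 * 32.0      # 18.6875

def pq_encode(luminance_nits):
    """Map absolute luminance (0..10000 cd/m^2) to a PQ code value in [0, 1]."""
    y = (luminance_nits / 10000.0) ** M1
    return ((C1 + C2 * y) / (1.0 + C3 * y)) ** M2

def pq_decode(code):
    """Inverse mapping: PQ code value in [0, 1] back to luminance in cd/m^2."""
    p = code ** (1.0 / M2)
    return 10000.0 * (max(p - C1, 0.0) / (C2 - C3 * p)) ** (1.0 / M1)
```

For example, 100 cd/m^2 maps to roughly half of the PQ code range, even though it is only 1% of the 10000 cd/m^2 peak, which reflects the denser allocation of code words at low luminance.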
[0123] The techniques described here can be used to adaptively remodel input video content (for example, source video content, PQ-encoded video content, video content with a relatively large set of available code words, etc.) into remodeled video content, encoded with a remodeling function (for example, a direct power function, etc.), comprising a relatively limited set of remodeled code words that can then be transmitted in a video signal of relatively low bit depth.
[0124] A relatively limited set of remodeled code words, as described here, can distribute more code words in regions with high luminance values for relatively smooth bright images, relatively smooth bright scenes, etc., such as images that span large areas with bright visual features (for example, a sky, windows, polished metal, airplanes, cars, etc.). Distributing more code words in regions with high luminance values in the remodeling function reduces or avoids visual artifacts, such as color banding, etc., in those images, scenes, etc. In some modalities, relatively large exponent values can be selected for (for example, direct, etc.) remodeling functions represented by direct power functions. A relatively large exponent value (for example, 1.6, 1.7, a relatively large value at which artifacts such as banding are eliminated, etc.) for a direct power function provides relatively more code words for the remodeling of parts of the video content representing bright areas. On the other hand, the inverses (for example, inverse power functions, etc.) of the (for example, direct, etc.) remodeling functions can use relatively small exponent values that are the reciprocals of the exponent values in the direct power functions, for the purpose of reconstructing a version of the pre-remodeled video content based on an adaptively remodeled video signal generated with the direct power functions with relatively large exponent values.
[0125] A relatively limited set of remodeled code words, as described here, can distribute more code words in regions with low luminance values for relatively dark images, relatively dark scenes, etc., such as images that comprise large dark visual features (for example, shadows, a starry sky, nights, low-light interiors, etc.). Distributing more code words in regions with low luminance values in the remodeling function helps to preserve image details or features in those images, scenes, etc. In some modalities, relatively small exponent values can be selected for (for example, direct, etc.) remodeling functions represented by direct power functions. A relatively small exponent value (for example, 0.6, 1.0, a relatively small value at which image details are preserved, etc.) for a direct power function provides relatively more code words for remodeling parts of the video content representing dark areas. On the other hand, the inverses (for example, inverse power functions, etc.) of the (for example, direct, etc.) remodeling functions can use relatively large exponent values that are the reciprocals of the exponent values in the direct power functions, for the purpose of reconstructing a version of the pre-remodeled video content based on an adaptively remodeled video signal generated with the direct power functions with relatively small exponent values.
[0126] For other images, scenes, etc., in which most objects or features are mid-tone, the remodeled code words can be distributed more evenly in terms of luminance values. For example, a direct power function can use an exponent value of 1.3, 1.4, etc.
[0127] It should be noted that the exponent values mentioned are for illustration purposes only. When power functions are used in at least part of a remodeling function, these and other exponent values can be adopted based on a variety of factors, including, but not limited to, image types, etc. In addition, optionally or alternatively, functions, relations, etc., other than power functions can be used in at least part of a remodeling function. For those other functions, the mentioned exponent values and other exponent values can be adopted.
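A minimal sketch of forward and inverse power-function remodeling, as discussed in the preceding paragraphs; the code word count, value range, and function names are illustrative assumptions. With an exponent alpha greater than one, relatively more remodeled code words land in the bright region; with alpha less than one, more land in the dark region; the inverse mapping uses the reciprocal exponent 1/alpha.

```python
def forward_reshape(v, alpha, v_min, v_max, num_codewords=256):
    """Map a source value v in [v_min, v_max] to one of num_codewords
    remodeled code words via a direct power function with exponent alpha."""
    t = (v - v_min) / float(v_max - v_min)        # normalize to [0, 1]
    return int(round((t ** alpha) * (num_codewords - 1)))

def inverse_reshape(s, alpha, v_min, v_max, num_codewords=256):
    """Approximately reconstruct the source value from a remodeled code
    word, using the reciprocal exponent 1/alpha."""
    t = (s / float(num_codewords - 1)) ** (1.0 / alpha)
    return v_min + t * (v_max - v_min)
```

For example, with alpha = 1.7 over a 10-bit source range, the brightest tenth of the input range receives many more distinct remodeled code words than the darkest tenth.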
[0128] One or more of a variety of processing algorithms can be implemented, according to the techniques described in the present invention, to determine whether an image, scene, etc., has potential bright areas prone to generating artifacts such as contours / banding in a reconstructed version of the image, scene, etc. In some modalities, each image in the image, scene, etc., can be divided into multiple non-overlapping NxN blocks (comprising NxN pixels, where N is a positive integer, such as 2, 4, 6, 8, 16, etc.). In each of some or all of the blocks, the minimum, maximum, average, etc., values within the block can be determined or calculated. Figure 4A illustrates an exemplary algorithm for determining whether an image (or frame), scene, etc., comprises smooth bright areas. The difference between the maximum and minimum values in a block can be calculated and compared with a difference threshold (designated Te). If the difference is less than the threshold (Te), the block can be classified as a smooth block potentially prone to artifacts such as banding. In addition, optionally or alternatively, to quantify or identify bright areas, the average value of the block can be compared with an average value threshold (designated Tb). If the average value exceeds the threshold (Tb), the block can be classified as a bright block. The number of smooth bright blocks in an image, scene, etc., can be determined based on the preceding analysis of the image, scene, etc. If the number of smooth bright blocks in an image constitutes more than a certain percentage (designated Pb) of the total number of blocks, then the image (or picture frame) can be considered an image with smooth bright areas.
[0129] One or more of a variety of image processing algorithms can be implemented, according to the techniques described here, to determine whether an image (or frame), scene, etc., has relatively large dark areas. Figure 4B illustrates an example of such an algorithm. In some embodiments, the total number of dark pixels is determined, among some or all of the pixels in an image, scene, etc., that have luminance values less than a luminance threshold (designated Td). If the percentage of dark pixels exceeds a percentage threshold (designated Pd), the image, scene, etc., is classified as having large dark areas.
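The Figure 4B test is simpler and can be sketched in the same style, again for a luminance image normalized to [0, 1]; the threshold values Td and Pd are illustrative assumptions.

```python
import numpy as np

def has_large_dark_areas(image, Td=0.1, Pd=0.4):
    """Classify the image as having large dark areas when the fraction of
    pixels with luminance below Td exceeds the percentage threshold Pd."""
    dark_fraction = float((image < Td).sum()) / image.size
    return dark_fraction > Pd
```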
[0130] Figure 4C illustrates another example to determine whether an image (or frame), scene, etc., has relatively large dark areas. A difference between the maximum and minimum values in a block can be calculated. If the difference is zero, or alternatively, in some modalities, less than a small difference threshold, the block can be classified as a pure black block and ignored. On the other hand, if the difference is non-zero, or alternatively, in some modalities, not less than a small difference threshold, the block is classified as a non-pure-black block. The average value of the block can be compared with a second average value threshold (designated Ts). In addition, optionally or alternatively, a standard deviation value for the block can also be calculated and compared with a standard deviation threshold (designated Tstd). If the average value is less than the second average value threshold (Ts) and the standard deviation value is less than the standard deviation threshold (Tstd), the block can be classified as a smooth dark block. A smooth dark area can then be identified as an area comprising at least a certain number (for example, eight, ten, sixteen, a different positive integer greater than two, etc.) of smooth dark blocks that are connected to one another. The largest smooth dark area in an image can be determined from the zero, one or more smooth dark areas identified in the image. If the number of smooth dark blocks in the largest smooth dark area of the image constitutes more than a certain percentage (designated Pbd) of the total number of blocks, then the image (or image frame) can be considered an image with smooth dark areas.
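The Figure 4C procedure can be sketched as follows: label each NxN block, then flood-fill 4-connected smooth dark blocks to find the largest smooth dark area. The function returns the largest area size in blocks together with the total block count, so a caller can apply the Pbd percentage test; the thresholds and 4-connectivity are illustrative assumptions.

```python
import numpy as np
from collections import deque

def largest_smooth_dark_area(image, N=16, Ts=0.2, Tstd=0.02):
    """Return (size of largest smooth dark area in blocks, total blocks)."""
    h, w = image.shape
    bh, bw = h // N, w // N
    smooth_dark = np.zeros((bh, bw), dtype=bool)
    for by in range(bh):
        for bx in range(bw):
            blk = image[by * N:(by + 1) * N, bx * N:(bx + 1) * N]
            if blk.max() - blk.min() == 0:
                continue                       # pure black block: ignored
            if blk.mean() < Ts and blk.std() < Tstd:
                smooth_dark[by, bx] = True     # smooth dark block
    # flood fill to find the largest 4-connected component
    seen = np.zeros_like(smooth_dark)
    best = 0
    for by in range(bh):
        for bx in range(bw):
            if smooth_dark[by, bx] and not seen[by, bx]:
                size, q = 0, deque([(by, bx)])
                seen[by, bx] = True
                while q:
                    cy, cx = q.popleft()
                    size += 1
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < bh and 0 <= nx < bw
                                and smooth_dark[ny, nx] and not seen[ny, nx]):
                            seen[ny, nx] = True
                            q.append((ny, nx))
                best = max(best, size)
    return best, bh * bw
```

A caller would then classify the image as having smooth dark areas when, for example, `best > Pbd * total` for some chosen Pbd.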
[0131] In some embodiments, a codec, as described here, can pre-compute a plurality of candidate LUTs for different sets of adaptive remodeling parameters. For example, in modalities in which the remodeling functions are based on power functions, each candidate set of adaptive remodeling parameters can include an upper limit of a dynamic range supported by a source video signal, a lower limit of the dynamic range supported by the source video signal, an exponent value, etc.; a corresponding candidate LUT can be pre-calculated based, at least in part, on that candidate set of adaptive remodeling parameters. Likewise, a candidate LUT that represents an inverse of a candidate remodeling function can be pre-calculated based on a candidate set of adaptive remodeling parameters. Additionally, optionally or alternatively, one or more candidate sets of polynomials, along with one or more corresponding sets of candidate polynomial coefficients, pivots, etc., can be pre-calculated to approximate one or more target LUTs that can represent a remodeling function and an inverse thereof. Some or all of the pre-calculated items mentioned above can be saved in memory. For example, candidate alpha values to be calculated may include, but are not limited to, any of the following: 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, etc. Note that an alpha value of 1.0 may not need a LUT, since the remodeling function in that case is a linear function. In some embodiments, candidate LUTs may include LUTs with different quantization steps (for example, represented by pairs of neighboring code word values, etc.). During runtime, when an image is being remodeled in an encoder or inversely mapped in a decoder, based on statistics calculated from the actual content of the image, a candidate LUT (or candidate polynomials with candidate polynomial coefficients, pivots, etc.) can be selected from the plurality of candidate LUTs, in order to apply the corresponding adaptive remodeling or inverse mapping.
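The pre-computation of candidate LUTs can be sketched as follows, using the candidate alpha values listed above; the LUT size and output code word count are illustrative assumptions, and alpha = 1.0 is skipped since the linear case needs no LUT.

```python
def precompute_candidate_luts(alphas=(0.5, 0.6, 0.7, 0.8, 0.9, 1.1, 1.2,
                                      1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9),
                              lut_size=1024, num_codewords=256):
    """Build one forward power-function remodeling LUT per candidate alpha,
    mapping lut_size input indices to num_codewords remodeled code words."""
    luts = {}
    for alpha in alphas:
        luts[alpha] = [int(round(((i / (lut_size - 1.0)) ** alpha)
                                 * (num_codewords - 1)))
                       for i in range(lut_size)]
    return luts
```

At runtime, a selector would pick one entry of the returned dictionary based on the image statistics (for example, the bright smooth / dark smooth / mid-tone classification discussed above).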
[0132] In some modalities, some or all of the polynomial coefficients, pivots, etc., according to the techniques described here, can be expressed in a certain number of bytes, such as four bytes (for example, 1 byte for the integer part and 3 bytes for the fractional part, etc.).
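The four-byte representation (1 byte of integer part, 3 bytes of fractional part) can be realized, for example, as an unsigned Q8.24 fixed-point layout. This particular packing (unsigned, big-endian) is an assumption, since the text does not fix it.

```python
def to_q8_24(x):
    """Pack a non-negative coefficient into 4 bytes: 1 byte of integer part
    and 3 bytes of fractional part (unsigned Q8.24, big-endian)."""
    if not (0.0 <= x < 256.0):
        raise ValueError("out of unsigned Q8.24 range")
    q = min(int(round(x * (1 << 24))), (1 << 32) - 1)  # guard against rounding overflow
    return q.to_bytes(4, "big")

def from_q8_24(raw):
    """Unpack 4 bytes back into a float with 2**-24 resolution."""
    return int.from_bytes(raw, "big") / float(1 << 24)
```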
[0133] LUTs (or polynomials with polynomial coefficients, pivots, etc.) can be adaptively determined from image to image, from scene to scene, etc.
6. Example of processing flows [0134] Figure 5A illustrates an exemplary processing flow. In some embodiments, one or more computing devices or components can perform this processing flow. In block 502, a video encoder (for example, 102 in Figure 1A) receives a sequence of source images; the video encoder (102) calculates one or more statistical values based on one or more source images in a sequence of source images.
[0135] In block 504, the video encoder (102) selects, based on one or more statistical values, an adaptive remodeling function for the one or more source images.
[0136] In block 506, the video encoder (102) adaptively remodels, based, at least in part, on the selected adaptive remodeling function, a portion of the source video content to generate a portion of remodeled video content, with the source video content portion being represented by the one or more source images.
[0137] In block 508, the video encoder (102) generates an approximation of an inverse of the selected adaptive remodeling function.
[0138] In block 510, the video encoder (102) encodes, in a remodeled video signal, the remodeled video content and a set of adaptive remodeling parameters that define the approximation of the inverse of the selected adaptive remodeling function.
[0139] In one embodiment, the remodeled video content portion comprises one or more remodeled images.
[0140] In one embodiment, the one or more source images form a scene.
[0141] In one embodiment, the one or more statistical values include at least one of a maximum value, a minimum value, an average value, a median value, an average value, a standard deviation value, etc., as determined based on the source code words in one or more source images.
[0142] In one modality, at least one of the selected adaptive remodeling function or the inverse of the selected adaptive remodeling function comprises one or more of analytical functions, non-analytical functions, lookup tables (LUTs), sigmoid functions, power functions, piecewise functions, etc.
[0143] In one modality, the approximation of the inverse of the selected adaptive remodeling function is represented by a set of polynomials.
[0144] In one embodiment, a total number of polynomials in the set of polynomials is limited below a numerical threshold.
[0145] In one modality, coefficients for polynomials in the set of polynomials are determined based on minimizing differences between the values given by the polynomials and the values given in a target lookup table (LUT) that represents the inverse of the adaptive remodeling function selected.
[0146] In one embodiment, the video encoder (102) is additionally configured to select a continuity condition, to generate the set of polynomials based on a type of function determined for the inverse of the selected adaptive remodeling function.
[0147] In one modality, the set of polynomials is predetermined before the one or more source images are processed for adaptive remodeling.
[0148] In one modality, the set of polynomials is determined dynamically while the one or more source images are being processed for adaptive remodeling.
[0149] In one embodiment, the video encoder (102) is additionally configured to classify the one or more source images as one of images comprising smooth bright areas, images comprising smooth dark areas, or mid-tone images.
[0150] In one embodiment, the source video content portion is adaptively remodeled into the remodeled video content portion for one or more channels in a plurality of channels of a color space.
[0151] In one embodiment, the one or more channels include a luminance-related channel.
[0152] In one embodiment, the remodeled video signal is one of an 8-bit video signal with two channels or a 10-bit video signal with a single channel.
[0153] In one modality, the remodeled video signal is generated by at least one of: an advanced video coding (AVC) encoder, a Moving Picture Experts Group (MPEG-2) encoder, or a high efficiency video coding (HEVC) encoder.
[0154] In one embodiment, the source image sequence is perceptually encoded.
[0155] In one embodiment, the source video content portion is adaptively remodeled in the remodeled video content portion without using any approximation of the selected adaptive remodeling function.
[0156] Figure 5B illustrates an exemplary processing flow. In some embodiments, one or more computing devices or components can perform this processing flow. In block 552, a video decoder (for example, 152 in Figure 1A) retrieves the remodeled video content and a set of adaptive remodeling parameters, related to an inverse of an adaptive remodeling function, from a remodeled video signal; the remodeled video content was generated by an upstream device based, at least in part, on the adaptive remodeling function.
[0157] In block 554, the video decoder (152) inversely maps, based, at least in part, on the inverse of the adaptive remodeling function, a portion of the remodeled video content to generate a portion of reconstructed video content.
[0158] In block 556, the video decoder (152) generates, based, at least in part, on the portion of reconstructed video content, a sequence of reconstructed images, with the sequence of reconstructed images representing a reconstructed version of a sequence of source images used by the upstream device to generate the remodeled video content.
[0159] In one embodiment, the video decoder (152) is also configured to display the plurality of reconstructed images on a display system.
[0160] In one embodiment, the video decoder (152) is also configured to establish an approximation of a target lookup table (LUT) that represents the inverse of the adaptive remodeling function, based, at least in part, on the set of adaptive remodeling parameters, related to an inverse of an adaptive remodeling function, retrieved from the remodeled video signal; etc.
[0161] In several exemplifying modalities, an encoder, a decoder, a transcoder, a system, an apparatus, or one or more among other computing devices performs any of or a part of the methods mentioned above, as described. In one embodiment, a non-transient, computer-readable storage medium stores software instructions that, when executed by one or more processors, lead to the execution of a method, as described here.
[0162] Note that, although separate modalities are discussed in the present invention, any combination of the modalities and/or partial modalities discussed herein can be combined to form additional modalities.
7. Real-time optimizations
[0163] As previously discussed, the method for generating a piecewise approximation of a remodeling function using second-order polynomials can be summarized as follows: given a starting point, a search is performed to identify a segment along the remodeling curve; if a set of termination conditions is satisfied, a pivot point is established, and a new search is initiated, until the entire curve has been segmented. This process can be subdivided as follows: (1) calculate the polynomial coefficients of each segment for candidate pivot points; (2) determine an approximation error for each candidate; (3) if the approximation error is less than a target threshold, then declare the segment valid and define a pivot point for that segment; (4) when all segments have been identified, the process can be repeated using a lower target threshold to improve the accuracy of the approximation polynomials.
[0164] From a computational point of view, step (1) takes the longest, followed by step (4). As the inventors have observed, some embodiments may require a real-time implementation, even at the cost of a potential reduction in accuracy. In this section, several improvements are presented for such real-time implementations. The improvements can be divided into two classes: a) a faster method for calculating the polynomial coefficients within each segment, without loss of precision, and b) a faster method for converging, that is, for reducing the total number of iterations needed to identify the best approximating segments (according to some criterion).
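For illustration only, the greedy search loop of steps (1)-(3) above can be sketched as follows; the function names, the error metric, and the square-root test curve are illustrative assumptions, not part of the patent:

```python
# Sketch of the greedy segmentation described in steps (1)-(3): grow each
# segment until the quadratic fit error would exceed the target threshold,
# then place a pivot point and start a new segment.
import numpy as np

def fit_segment(s, v):
    """Least-squares 2nd-order polynomial fit; returns coefficients and max error."""
    S = np.stack([np.ones_like(s), s, s * s], axis=1)
    m, *_ = np.linalg.lstsq(S, v, rcond=None)
    err = np.max(np.abs(S @ m - v))
    return m, err

def greedy_segmentation(s, v, threshold, max_len=64):
    """Grow each segment while the fit error stays below `threshold`."""
    pivots, polys = [0], []
    lo = 0
    while lo < len(s) - 1:
        hi = min(lo + 2, len(s) - 1)          # need >= 3 points for a quadratic
        m, err = fit_segment(s[lo:hi + 1], v[lo:hi + 1])
        # try extending the candidate end point one sample at a time
        while hi < len(s) - 1 and hi - lo < max_len:
            m2, err2 = fit_segment(s[lo:hi + 2], v[lo:hi + 2])
            if err2 > threshold:
                break
            hi, m, err = hi + 1, m2, err2
        pivots.append(hi)                      # pivot point for this segment
        polys.append(m)
        lo = hi
    return pivots, polys

s = np.linspace(0.0, 1.0, 256)
v = np.sqrt(s)                                 # toy stand-in for an inverse reshaping curve
pivots, polys = greedy_segmentation(s, v, threshold=1e-3)
assert pivots[0] == 0 and pivots[-1] == len(s) - 1
```

Step (4), lowering the threshold and repeating, simply wraps this loop, as discussed later in this section.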
Fast polynomial solution without continuity constraint [0165] Consider, without loss of generality, the p-th segment of a curve approximated by a second-order polynomial
v̂_{p,i} = m_{p,0} + m_{p,1}·s_{p,i} + m_{p,2}·(s_{p,i})^2, (10)
where s_{p,i} denotes the i-th reshaped pixel value of the standard (or lower) dynamic range (SDR or LDR) signal corresponding to the p-th polynomial segment. Let v_{p,i} be the co-located original high dynamic range (HDR) pixel value, and let v̂_{p,i} be the predicted co-located HDR pixel value. Let the p-th polynomial segment cover a range of SDR values from s_{p,L} to s_{p,H}, where L is the low pivot index and H is the high pivot index in the current p-th segment (for the sake of simplicity, the index p is dropped from L and H). In one embodiment, the parameters m_{p,j}, for j = 0, 1, and 2, of the polynomial in equation (10) can be calculated using the least-squares solution, as follows.
[0166] Equation (10) can be rewritten in matrix-vector form as
v̂_p = S_p m_p, (11)
where S_p is the (H - L + 1) × 3 matrix whose rows are [1, s_{p,i}, (s_{p,i})^2] for i = L, L+1, ..., H, and v̂_p = [v̂_{p,L}, v̂_{p,L+1}, ..., v̂_{p,H}]^T, with a least-squares solution given by
m_p = ((S_p)^T (S_p))^{-1} ((S_p)^T v_p),
where v_p = [v_{p,L}, v_{p,L+1}, ..., v_{p,H}]^T denotes the vector of original HDR pixel values. To facilitate the discussion, a matrix B_p and a vector a_p are also defined as
B_p = (S_p)^T (S_p), (12)
and
a_p = (S_p)^T v_p. (13)
[0167] Each element of the matrix B_p can be calculated as:
b_{p,00} = Σ_{i=L}^{H} 1 = H - L + 1,
b_{p,01} = b_{p,10} = Σ_{i=L}^{H} s_{p,i},
b_{p,20} = b_{p,11} = b_{p,02} = Σ_{i=L}^{H} (s_{p,i})^2, (14)
b_{p,21} = b_{p,12} = Σ_{i=L}^{H} (s_{p,i})^3,
b_{p,22} = Σ_{i=L}^{H} (s_{p,i})^4.
[0168] For the vector a_p, each element can be calculated as
a_{p,0} = Σ_{i=L}^{H} v_{p,i},
a_{p,1} = Σ_{i=L}^{H} s_{p,i} v_{p,i}, (15)
a_{p,2} = Σ_{i=L}^{H} (s_{p,i})^2 v_{p,i}.
[0169] From equation (14), the calculation of the elements of the matrix B_p requires the calculation of the factors
Σ_i (s_{p,i})^k, for k = 0, 1, 2, 3, 4. (16)
[0170] However, the values s_{p,i} are finite, that is, within [0, 1] if normalized, or within [0, 2^{bit_depth} - 1] if not normalized; therefore, in one embodiment, the computation time for solving equation (11) can be improved with the use of pre-calculated lookup tables (LUTs), as described below.
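As an illustrative check, the normal-equations form of equations (12)-(15) can be verified numerically against a direct least-squares solve of equation (11); the synthetic segment data below are an assumption made for the sketch:

```python
# Verify that m_p = B_p^{-1} a_p (equations (12)-(15)) matches the direct
# least-squares solution of equation (11) on a synthetic segment.
import numpy as np

rng = np.random.default_rng(0)
s = np.sort(rng.random(50))                     # reshaped SDR values s_{p,i}
v = 0.05 + 1.2 * s - 0.3 * s ** 2 + 0.001 * rng.standard_normal(50)

S = np.stack([np.ones_like(s), s, s ** 2], axis=1)   # matrix S_p of equation (11)
m_lstsq, *_ = np.linalg.lstsq(S, v, rcond=None)

B = S.T @ S                                     # equation (12); entries are the sums in (14)
a = S.T @ v                                     # equation (13); entries are the sums in (15)
m_normal = np.linalg.solve(B, a)                # m_p = B_p^{-1} a_p

assert np.allclose(m_lstsq, m_normal)
assert np.isclose(B[0, 0], len(s))              # b_{p,00} = sum of 1 = H - L + 1
```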
[0171] Let β (beta) denote a (Max + 1) × 5 LUT of values pre-calculated from equation (16), as follows:
β[k, j] = Σ_{i=0}^{k} (s_i)^j, (17)
where the indices k, j of β[k, j] are in the ranges k = [0, Max] and j = [0, 4] (so that, in particular, β[k, 0] = k + 1), and Max denotes the maximum possible pixel value in a specific video sequence.
[0172] From the LUT above, for any segment p with SDR values between s_L and s_H, the matrix B_p can be calculated as follows:
b_{p,00} = H - L + 1,
b_{p,01} = b_{p,10} = β_{H,1} - β_{L-1,1},
b_{p,20} = b_{p,11} = b_{p,02} = β_{H,2} - β_{L-1,2}, (18)
b_{p,21} = b_{p,12} = β_{H,3} - β_{L-1,3},
b_{p,22} = β_{H,4} - β_{L-1,4}.
[0173] When the lowest pivot is 0, one can substitute L = 0 into equation (14) and calculate the matrix B_p using
b_{p,00} = H + 1,
b_{p,01} = b_{p,10} = β_{H,1},
b_{p,20} = b_{p,11} = b_{p,02} = β_{H,2}, (19)
b_{p,21} = b_{p,12} = β_{H,3},
b_{p,22} = β_{H,4}.
[0174] Following the same process as for the beta LUT, one can also define a (Max + 1) × 4 alpha LUT (α) as follows:
α[k, j] = Σ_{i=0}^{k} v_i (s_i)^j, for 0 ≤ k ≤ Max and 0 ≤ j ≤ 3, (20)
where v_i denotes the original HDR pixel value corresponding to the reshaped value s_i.
[0175] Then, the elements of the vector a_p can be calculated in terms of the alpha LUT as follows:
a_{p,0} = α_{H,0} - α_{L-1,0},
a_{p,1} = α_{H,1} - α_{L-1,1}, (21)
a_{p,2} = α_{H,2} - α_{L-1,2}.
[0176] For L = 0, equation (21) can be simplified as
a_{p,0} = α_{H,0},
a_{p,1} = α_{H,1}, (21b)
a_{p,2} = α_{H,2}.
[0177] In summary, Table 1 shows, in pseudocode, the steps for a faster calculation of the polynomial coefficients:
Table 1: Fast solution for the polynomial parameters, without continuity constraint
1. At the sequence level, create the LUT β
2. For each frame, create the LUT α
3. For each segment:
a. Calculate the matrix B_p using equation (18) or (19)
b. Calculate the vector a_p using equation (21) or (21b)
c. Solve m_p = (B_p)^{-1} a_p
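The steps of Table 1 can be sketched as follows; the cumulative-sum construction of the β and α LUTs follows equations (17) and (20) directly, while the function names and the synthetic test data are illustrative assumptions:

```python
# Sketch of Table 1: precompute cumulative LUTs once, then solve each
# segment with O(1) lookups instead of per-segment summations.
import numpy as np

def build_beta(s):
    """beta[k, j] = sum_{i<=k} s_i**j, j = 0..4 (equation (17))."""
    powers = np.stack([s ** j for j in range(5)], axis=1)   # shape (Max+1, 5)
    return np.cumsum(powers, axis=0)

def build_alpha(s, v):
    """alpha[k, j] = sum_{i<=k} v_i * s_i**j, j = 0..3 (equation (20))."""
    powers = np.stack([v * s ** j for j in range(4)], axis=1)
    return np.cumsum(powers, axis=0)

def solve_segment(beta, alpha, L, H):
    """Solve m_p = B_p^{-1} a_p for segment [L, H] via equations (18) and (21)."""
    def d(lut, j):  # range sum over i = L..H, handling the L = 0 case of (19)/(21b)
        return lut[H, j] - (lut[L - 1, j] if L > 0 else 0.0)
    B = np.array([[d(beta, 0), d(beta, 1), d(beta, 2)],
                  [d(beta, 1), d(beta, 2), d(beta, 3)],
                  [d(beta, 2), d(beta, 3), d(beta, 4)]])
    a = np.array([d(alpha, 0), d(alpha, 1), d(alpha, 2)])
    return np.linalg.solve(B, a)

# The LUT-based solve recovers an exactly quadratic mapping on any segment.
s = np.arange(256, dtype=np.float64) / 255.0
v = 0.1 + 0.5 * s + 0.3 * s ** 2                 # synthetic HDR values
beta, alpha = build_beta(s), build_alpha(s, v)
m = solve_segment(beta, alpha, 32, 96)
assert np.allclose(m, [0.1, 0.5, 0.3])
```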
Fast polynomial solution with continuity constraint [0178] Under the continuity constraint, the starting point of the p-th polynomial is forced to coincide with the end point of the (p-1)-th polynomial, so that the connected segments are continuous. Let s_{p-1} denote the final SDR value of the (p-1)-th segment. Then, the corresponding predicted HDR value can be calculated as follows:
v̂_{p-1} = m_{p-1,0} + m_{p-1,1}·s_{p-1} + m_{p-1,2}·(s_{p-1})^2. (22)
[0179] Under the constraint that, for s_{p,i} = s_{p-1}, there must be a single predicted value, v̂_{p-1}, the prediction polynomial can be derived as follows:
(v_{p,i} - v̂_{p-1}) = m_{p,1}(s_{p,i} - s_{p-1}) + m_{p,2}(s_{p,i} - s_{p-1})^2, (23)
which, as described previously, can be expressed in matrix-vector form as
[m_{p,1}, m_{p,2}]^T = ((S̃_p)^T (S̃_p))^{-1} ((S̃_p)^T ṽ_p), (24)
where S̃_p and ṽ_p collect the shifted values (s_{p,i} - s_{p-1}) and (v_{p,i} - v̂_{p-1}), respectively.
[0180] Note that, during a forward segment search, the (p-1)-th segment comes before the p-th segment, so the end point of the (p-1)-th segment must coincide with the start point of the p-th segment; however, during a reverse segment search, the (p-1)-th segment comes after the p-th segment, so the start point of the (p-1)-th segment must coincide with the end point of the p-th segment.
[0181] Given equation (24), and following an approach similar to the one described above, it can be shown that a solution to this equation can be derived by following the steps in Table 2.
Table 2: Fast solution for the polynomial parameters, with continuity constraint
1. At the sequence level, create the LUT β
2. For each frame, at the start of each segment approximation, initialize a_{p,1} = 0 and a_{p,2} = 0
3. Forward approximation
• For the current p-th segment, let L denote the known low point
• For i = L, ..., H:
• Calculate the matrix B_p:
b_{p,11} = β_{H,2} - β_{L-1,2}
b_{p,12} = b_{p,21} = β_{H,3} - β_{L-1,3}
b_{p,22} = β_{H,4} - β_{L-1,4}
• Calculate the vector a_p. Let H denote a candidate end point, and let v̂_{p-1} denote the estimated HDR value of the end point of the previous segment; then update
a_new = a_old + s_{(H-i-L)}(v_{p,i} - v̂_{p-1}),
where v_{p,i} is the HDR value corresponding to the SDR value s_{p,i}, and s_{(H-i-L)} is the (H-i-L)-th SDR value.
4. Reverse approximation
• For the current p-th segment, let H denote the known high point
• For i = H, ..., L:
• Calculate the matrix B_p:
b_{p,11} = β_{H,2} - β_{L-1,2}
b_{p,12} = b_{p,21} = -(β_{H,3} - β_{L-1,3})
b_{p,22} = β_{H,4} - β_{L-1,4}
• Calculate the vector a_p. Let L denote a candidate starting point, and let v̂_{p-1} denote the HDR value calculated from the starting point of the forward segment; then update
a_new = a_old + s_{(H-L-i)}(v_{p,i} - v̂_{p-1}),
where v_{p,i} is the HDR value corresponding to the SDR value s_{p,i}, and s_{(H-L-i)} is the (H-L-i)-th SDR value.
5. Solve m_p = (B_p)^{-1} a_p
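The continuity-constrained fit of equations (22)-(24) can be sketched as follows; rather than the incremental LUT updates of Table 2, this sketch solves the shifted least-squares problem of equation (24) directly, and the helper name and test values are illustrative assumptions:

```python
# Continuity-constrained fit: pin the segment's start to the previous
# segment's end point (equation (22)), then fit only m1 and m2 in the
# shifted coordinates of equation (23).
import numpy as np

def fit_with_continuity(s, v, s_prev, v_prev):
    """Fit (v - v_prev) ~ m1*(s - s_prev) + m2*(s - s_prev)**2 (equation (23))."""
    ds = s - s_prev
    S = np.stack([ds, ds * ds], axis=1)          # S-tilde of equation (24)
    (m1, m2), *_ = np.linalg.lstsq(S, v - v_prev, rcond=None)
    # Expand back to the absolute form v ~ c0 + c1*s + c2*s**2
    c2 = m2
    c1 = m1 - 2.0 * m2 * s_prev
    c0 = v_prev - m1 * s_prev + m2 * s_prev ** 2
    return np.array([c0, c1, c2])

s = np.linspace(0.3, 0.6, 64)
v = 0.2 + 0.8 * s - 0.4 * s ** 2
# End point of the previous segment (equation (22)), here on the same curve.
s_prev = 0.3
v_prev = 0.2 + 0.8 * s_prev - 0.4 * s_prev ** 2
m = fit_with_continuity(s, v, s_prev, v_prev)
assert np.allclose(m, [0.2, 0.8, -0.4])
# The fitted curve passes exactly through the previous segment's end point.
assert np.isclose(m[0] + m[1] * s_prev + m[2] * s_prev ** 2, v_prev)
```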
Adaptive termination conditions for faster convergence [0182] As previously described, the approximation algorithm works by successively searching for polynomial segments. It begins at a starting point and then tries candidate end points along the curve; if the termination conditions are met at any point on the curve, the algorithm defines that point as the end pivot and then starts a new search. As previously described, the termination conditions for any segment are:
Rule #1:
(prev_error_condition && curr_error_condition) || max_custom_length_condition
[0183] That is, either
i. there is a rising edge between two consecutive threshold detections, that is, the current fitting error is less than the error threshold and a fit using one point fewer also produces an error less than the threshold, or
ii. the segment is longer than a predefined maximum length.
Rule #2:
(curr_error_condition && min_custom_length_condition) || max_custom_length_condition
[0184] That is, either
i. the current fitting error is less than the error threshold and the segment has at least a predefined minimum length, or
ii. the segment is longer than a predefined maximum length.
Rule #3:
curr_error_condition && min_custom_length_condition
[0185] That is:
i. the current fitting error is less than the error threshold, and
ii. the segment meets a predefined minimum length.
[0186] In one embodiment, the error threshold starts from a fixed value and, at the end of each iteration, is scaled by a factor k (k < 1, for example, k = 0.75). The search continues until a maximum number of iterations (n) is reached.
[0187] In one embodiment, given an original error threshold th, at each iteration the threshold is reduced by a fixed percentage (for example, th = k·th, k < 1). Considering the worst-case scenario, in which the optimal solution has zero error, with an error reduction by k at each iteration, given n iterations, this strategy guarantees a solution that is at least k^n closer to the optimum. Table 3 presents an example method for adjusting the target error according to another embodiment, which provides faster, suboptimal convergence.
Table 3: Adaptive error-threshold adjustment
1. Define an initial error threshold th (for example, th = 0.005).
2. Fit the second-order polynomials to the segments one by one, from left to right (forward search) or from right to left (reverse search); each segment is chosen in such a way that its fitting error does not exceed the threshold th.
3. Keep a record of the minimum error found after each curve fit, as e_fit.
4. Define a new target error as k·e_fit, where k < 1 (for example, k = 0.5).
[0188] Higher k values have been found to improve error performance, but at the expense of speed.
[0189] It was also observed that, for some frames, the error converges to a constant value; proceeding with further iterations then adds no value in terms of error performance. This scenario can be avoided by adding one more termination condition:
Rule #4:
i. Terminate if the best error in the previous iteration (for example, e_fit(t-1)) is equal, within a threshold, to the best error in the current iteration (for example, e_fit(t)), that is, |e_fit(t) - e_fit(t-1)| < th2.
[0190] The added termination condition causes the algorithm to stop searching over additional iterations if it is found that there is no improvement from further reducing the error threshold. It was observed that adding this condition does not introduce any significant visual artifact after application of the inverse reshaping function to the image.
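The threshold schedule of Table 3, combined with the Rule #4 stop, can be sketched as follows; the toy `fit_curve` model (a pass that reaches 80% of the target threshold but never goes below a fixed error floor) is an assumption standing in for a full segmentation pass:

```python
# Sketch of Table 3's iterative threshold schedule plus the Rule #4
# early-termination test. `fit_curve` stands in for one full forward or
# reverse segmentation pass; it returns the minimum fitting error achieved
# for a given target threshold.
def adaptive_threshold_search(fit_curve, th=0.005, k=0.5, eps=1e-6, max_iter=50):
    prev_best = None
    iterations = 0
    for _ in range(max_iter):
        iterations += 1
        e_fit = fit_curve(th)                  # Table 3, steps 2-3
        if prev_best is not None and abs(e_fit - prev_best) < eps:
            break                              # Rule #4: no further improvement
        prev_best = e_fit
        th = k * e_fit                         # Table 3, step 4
    return prev_best, iterations

# Toy model: each pass lands at 80% of the target, but never below a floor.
floor = 1e-4
best, iters = adaptive_threshold_search(lambda th: max(0.8 * th, floor))
assert abs(best - floor) < 1e-9     # converged to the assumed error floor
assert iters < 50                   # Rule #4 stopped the search early
```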
Construction of the LUT with inverted search [0191] A LUT with inverted search allows a decoder to perform backward mapping, that is, to map reshaped input (SDR) values to original HDR values. Since multiple HDR values can be mapped to a single SDR value, in one embodiment, without limitation, the average HDR value is selected. Table 4 describes, in pseudocode, the fast construction of a reshaping LUT with inverted search.
Table 4: Construction of a reshaping LUT with inverted search
// Let HDR_to_SDR be the forward LUT that converts HDR to SDR.
// Initialize the histogram
for (i = 0; i < SDRmax; i++)
    hist[i] = 0;  // SDRmax is the maximum SDR intensity
// Initialize the LUT with inverted search
for (i = 0; i < SDRmax; i++)
    SDR_to_HDR[i] = 0;
// Form a histogram and a cumulative table
for (k = 0; k < HDRmax; k++) {
    sdr = HDR_to_SDR[k];
    hist[sdr]++;
    SDR_to_HDR[sdr] += k;
}
// Use the histogram to update the LUT with inverted search
for (sdr = 0; sdr < SDRmax; sdr++) {
    SDR_to_HDR[sdr] = SDR_to_HDR[sdr] / hist[sdr];  // Use the average value
}
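Table 4 can be rendered in Python as follows; unlike the pseudocode, this sketch also guards against SDR codewords that receive no HDR samples, which is an added assumption, and the toy forward LUT is illustrative:

```python
# Rendering of Table 4: build the backward (SDR -> HDR) LUT from a forward
# HDR -> SDR LUT by averaging all HDR codewords that map to each SDR codeword.
import numpy as np

def build_inverse_lut(hdr_to_sdr, sdr_max):
    hist = np.zeros(sdr_max, dtype=np.int64)          # occurrences per SDR value
    cumulative = np.zeros(sdr_max, dtype=np.float64)  # sum of mapped HDR values
    for k, sdr in enumerate(hdr_to_sdr):              # histogram + cumulative table
        hist[sdr] += 1
        cumulative[sdr] += k
    sdr_to_hdr = np.zeros(sdr_max)
    nonzero = hist > 0                                # guard: skip unused codewords
    sdr_to_hdr[nonzero] = cumulative[nonzero] / hist[nonzero]  # average value
    return sdr_to_hdr

# Toy forward LUT: 1024 HDR codewords squeezed into 256 SDR codewords.
hdr_to_sdr = np.arange(1024) // 4
inv = build_inverse_lut(hdr_to_sdr, 256)
assert inv[0] == 1.5            # average of HDR codewords 0..3
assert inv[255] == 1021.5       # average of HDR codewords 1020..1023
```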
8. Implementation mechanisms - hardware overview [0192] According to one embodiment, the techniques described herein are implemented by one or more special-purpose computing devices. The special-purpose computing devices can be hard-wired to perform the techniques, or they can include digital electronic devices, such as one or more application-specific integrated circuits (ASICs) or field-programmable gate arrays (FPGAs), that are persistently programmed to perform the techniques, or they can include one or more general-purpose hardware processors programmed to perform the techniques according to program instructions in firmware, memory, other storage, or a combination thereof. Such special-purpose computing devices can also combine custom hard-wired logic, ASICs, or FPGAs with custom programming to accomplish the techniques. The special-purpose computing devices can be desktop computer systems, portable computer systems, handheld devices, networking devices, or any other device that incorporates hard-wired and/or program logic to implement the techniques.
[0193] For example, Figure 6 is a block diagram illustrating a computer system 600, in which an exemplary embodiment of the invention can be implemented. Computer system 600 includes a bus 602 or other communication mechanism for communicating information, and a hardware processor 604 coupled to bus 602 for processing information. The hardware processor 604 can be, for example, a general purpose microprocessor.
[0194] Computer system 600 also includes a main memory 606, such as a random access memory (RAM) or other dynamic storage device, coupled to bus 602 for storing information and instructions to be executed by processor 604. Main memory 606 can also be used for storing temporary variables or other intermediate information during execution of instructions to be executed by processor 604. Such instructions, when stored on non-transitory storage media accessible to processor 604, render computer system 600 a special-purpose machine that is customized to perform the operations specified in the instructions.
[0195] Computer system 600 also includes a read-only memory (ROM) 608 or other static storage device coupled to bus 602 for storing static information and instructions for processor 604. A storage device 610, such as a magnetic disk or optical disc, is provided and coupled to bus 602 for storing information and instructions.
[0196] Computer system 600 can be coupled via bus 602 to a display 612, such as a liquid crystal display, for displaying information to a computer user. An input device 614, including alphanumeric and other keys, is coupled to bus 602 for communicating information and command selections to processor 604. Another type of user input device is cursor control 616, such as a mouse, a trackball, or cursor direction keys for communicating information and command selections to processor 604 and for controlling cursor movement on display 612. This input device typically has two degrees of freedom on two axes, a first axis (for example, x) and a second axis (for example, y), which allow the device to specify positions in a plane.
[0197] Computer system 600 can implement the techniques described herein using custom hard-wired logic, one or more ASICs or FPGAs, firmware, and/or program logic that, in combination with the computer system, causes or programs computer system 600 to be a special-purpose machine. According to one embodiment, the techniques of the present invention are performed by computer system 600 in response to processor 604 executing one or more sequences of one or more instructions contained in main memory 606. Such instructions can be read into main memory 606 from another storage medium, such as storage device 610. Execution of the sequences of instructions contained in main memory 606 causes processor 604 to perform the processing steps described herein. In alternative embodiments, hard-wired circuitry can be used in place of or in combination with software instructions.
[0198] The term storage media, as used herein, refers to any non-transitory media that store data and/or instructions that cause a machine to operate in a specific way. Such storage media may comprise non-volatile media and/or volatile media. Non-volatile media include, for example, optical or magnetic disks, such as storage device 610. Volatile media include dynamic memory, such as main memory 606. Common forms of storage media include, for example, a floppy disk, a flexible disk, a hard disk, a solid-state drive, a magnetic tape, or any other magnetic data storage medium, a CD-ROM, any other optical data storage medium, any physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, NVRAM, or any other memory chip or cartridge.
[0199] Storage media are distinct from, but can be used in conjunction with, transmission media. Transmission media participate in transferring information between storage media. For example, transmission media include coaxial cables, copper wire, and fiber optics, including the wires that comprise bus 602. Transmission media can also take the form of acoustic or light waves, such as those generated during radio-wave and infrared data communications.
[0200] Various forms of media can be involved in carrying one or more sequences of one or more instructions to processor 604 for execution. For example, the instructions can initially be carried on a magnetic disk or a solid-state drive of a remote computer. The remote computer can load the instructions into its dynamic memory and send the instructions over a telephone line using a modem. A modem local to computer system 600 can receive the data on the telephone line and use an infrared transmitter to convert the data to an infrared signal. An infrared detector can receive the data carried in the infrared signal, and appropriate circuitry can place the data on bus 602. Bus 602 carries the data to main memory 606, from which processor 604 retrieves and executes the instructions. The instructions received by main memory 606 can optionally be stored on storage device 610 either before or after execution by processor 604.
[0201] Computer system 600 also includes a communication interface 618 coupled to bus 602. Communication interface 618 provides a bidirectional data communication coupling to a network link 620 that is connected to a local area network 622. For example , the 618 communication interface can be an integrated services digital network card (ISDN), a cable modem, a satellite modem, or a modem to provide a data communication connection to a corresponding type of telephone line. As another example, the communication interface 618 can be a local area network (LAN) card to provide a data communication connection to a compatible LAN. Wireless links can also be implemented. In any such implementation, the communication interface 618 sends and receives electrical, electromagnetic or optical signals that transmit digital data streams that represent different types of information.
[0202] Network link 620 typically provides data communication through one or more networks to other data devices. For example, network link 620 can provide a connection through local network 622 to a host computer 624, or to data equipment operated by an Internet service provider (ISP) 626. ISP 626, in turn, provides data communication services through the worldwide packet data communication network commonly referred to as the Internet 628. Local network 622 and Internet 628 both use electrical, electromagnetic, or optical signals that carry digital data streams. The signals through the various networks and the signals on network link 620 and through communication interface 618, which carry the digital data to and from computer system 600, are example forms of transmission media.
[0203] Computer system 600 can send messages and receive data, including program code, through the network(s), network link 620, and communication interface 618. In the Internet example, a server 630 can transmit a requested code for an application program through Internet 628, ISP 626, local network 622, and communication interface 618.
[0204] The received code can be executed by processor 604 as it is received, and/or stored in storage device 610 or other non-volatile storage for later execution.
9. Equivalents, extensions, alternatives and miscellaneous [0205] In the foregoing specification, example embodiments of the present invention have been described with reference to numerous specific details that may vary from implementation to implementation. Thus, the sole and exclusive indicator of what the invention is, and is intended by the applicant to be the invention, is the set of claims that issue from this application, in the specific form in which such claims issue, including any subsequent correction. Any definitions expressly set forth herein for terms contained in such claims shall govern the meaning of such terms as used in the claims. Hence, no limitation, element, property, feature, advantage, or attribute that is not expressly recited in a claim should limit the scope of such claim in any way. The specification and the drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.
Claims (20)
1. Method, CHARACTERIZED by comprising:
calculating one or more statistical values based on one or more source images in a sequence of source images;
selecting, based on the one or more statistical values, an adaptive reshaping function for the one or more source images, the adaptive reshaping function mapping source code words to reshaped code words;
adaptively reshaping, based at least in part on the selected adaptive reshaping function, a portion of source video content to generate a portion of reshaped video content, the portion of source video content being represented by the one or more source images;
generating an approximation of an inverse of the selected adaptive reshaping function, which comprises:
determining a target lookup table (LUT) comprising key-value pairs representing the inverse of the selected adaptive reshaping function;
generating a first approximation of the target LUT by performing a forward search, from small keys to large keys, over the key-value pairs of the LUT;
generating a second approximation of the target LUT by performing an inverted search, from large keys to small keys, over the key-value pairs of the LUT; and
selecting one of the first approximation and the second approximation by comparing approximation errors respectively generated by the forward search and the inverted search; and
encoding the reshaped video content and a set of adaptive reshaping parameters that define the approximation of the inverse of the selected adaptive reshaping function into a reshaped video signal,
wherein the approximation of the inverse of the selected adaptive reshaping function is represented by a set of second-order polynomials, the method further comprising:
determining a continuity condition for approximating the target LUT;
based on the continuity condition, selecting a first stop rule for the forward search used to approximate the target LUT, and a second stop rule for the inverted search used to approximate the target LUT;
generating the first approximation based, at least in part, on the first stop rule; and
generating the second approximation based, at least in part, on the second stop rule.
2. Method, according to claim 1, CHARACTERIZED by the fact that the portion of reshaped video content comprises one or more reshaped images.
3. Method, according to claim 1, CHARACTERIZED by the fact that the one or more source images form a scene.
4. Method, according to claim 1, CHARACTERIZED by the fact that the target LUT is an optimal inverted-search LUT generated by averaging the source code word values that are mapped to each reshaped code word value in a plurality of reshaped code word values used to reshape the source video content.
5. Method, according to claim 1, CHARACTERIZED by the fact that the one or more statistical values include at least one of a maximum value, a minimum value, a mean value, a median value, an average value, or a standard deviation value, as determined based on the source code words in the one or more source images.
6. Method, according to claim 1, CHARACTERIZED by the fact that at least one of the selected adaptive reshaping function or the inverse of the selected adaptive reshaping function comprises one or more of analytical functions, non-analytical functions, lookup tables (LUTs), sigmoid functions, power functions, or piecewise functions.
7. Method, according to claim 1, CHARACTERIZED by the fact that coefficients for polynomials in the set of polynomials are determined based on minimizing differences between values given by the polynomials and values given in a target lookup table (LUT) that represents the inverse of the selected adaptive reshaping function.
8. Method, according to claim 1, CHARACTERIZED by the fact that it further comprises selecting a continuity condition for generating the set of polynomials, based on a function type determined for the inverse of the selected adaptive reshaping function.
9. Method, according to claim 1, CHARACTERIZED by the fact that the set of polynomials is determined dynamically while the one or more source images are processed for adaptive reshaping.
10. Method, according to claim 1, CHARACTERIZED by the fact that it further comprises classifying the one or more source images as one of images comprising smooth bright areas, images comprising smooth dark areas, or mid-tone images.
11. Method, according to claim 1, CHARACTERIZED by the fact that the adaptive reshaping function is approximated using two or more second-order polynomials, and calculating the coefficients m_p of the p-th polynomial comprises:
determining a first lookup table (LUT), beta, based on a function of the reshaped pixel values in the sequence of source images;
determining a second lookup table (LUT), alpha, based on a function of the original pixel values in a source image and the reshaped pixel values;
determining a matrix B_p based on the first LUT;
determining a vector a_p based on the second LUT; and
calculating the coefficients m_p of the p-th polynomial as (B_p)^{-1} a_p.
12. Method, according to claim 11, CHARACTERIZED by the fact that, for an element β[k, j] of the first LUT:
β[k, j] = k + 1, for 0 ≤ k ≤ Max, j = 0,
β[k, j] = Σ_{i=0}^{k} (s_i)^j, for 0 ≤ k ≤ Max, 1 ≤ j ≤ 4,
where Max denotes the maximum pixel value of the reshaped pixels s_i corresponding to the pixels v_i of a source image in the sequence of source images.
13. Method, according to claim 11, CHARACTERIZED by the fact that, for an element α[k, j] of the second LUT, for 0 ≤ j ≤ 3:
α[k, j] = Σ_{i=0}^{k} v_i (s_i)^j, for 0 ≤ k ≤ Max,
where Max denotes the maximum pixel value of the reshaped pixels s_i corresponding to the pixels v_i of a source image in the sequence of source images.
14. Method, according to claim 1, CHARACTERIZED by the fact that generating a LUT for an inverse reshaping function comprises:
generating a histogram of reshaped values based on the forward reshaping function;
generating a cumulative table, wherein a record in the cumulative table comprises the sum of the original pixel values mapped to the same reshaped value; and
generating the LUT for the inverse reshaping function based on the histogram of the reshaped values and the cumulative table.
15. Method, according to claim 1, CHARACTERIZED by the fact that the adaptive reshaping function is approximated using two or more second-order polynomials, and pivot points for the two or more polynomials are selected according to an iterative method.
16. Method, according to claim 15, CHARACTERIZED by the fact that the iterative method further comprises:
defining an initial error threshold;
fitting segments of the adaptive reshaping function so that the fitting error of each of the two or more polynomials against its corresponding segment of the adaptive reshaping function does not exceed the initial error threshold;
determining the minimum of all fitting errors over all segments of the adaptive reshaping function; and
repeating the fitting process with a new error threshold, wherein the new error threshold is less than the minimum of all fitting errors.
17. Method, according to claim 16, CHARACTERIZED by the fact that it further comprises terminating the iterative method when the minimum of all fitting errors in the current iteration is equal, within a threshold, to the minimum of all fitting errors in the previous iteration.
18. Method for reconstructing video using a processor in a decoder, the method CHARACTERIZED by comprising:
retrieving reshaped video content and a set of adaptive reshaping parameters that define a set of second-order polynomials approximating an inverse of an adaptive reshaping function from a reshaped video signal, the inverse of the adaptive reshaping function mapping reshaped code words to reconstructed source code words;
the reshaped video content having been generated by an upstream device based, at least in part, on the adaptive reshaping function, wherein the adaptive reshaping function is selected according to the method defined in claim 1;
backward mapping, based at least in part on the set of second-order polynomials that approximates the inverse of the adaptive reshaping function, a portion of the reshaped video content to generate a portion of reconstructed video content; and
generating, based at least in part on the portion of reconstructed video content, a sequence of reconstructed images, the sequence of reconstructed images representing a reconstructed version of a sequence of source images used by the upstream device to generate the reshaped video content.
19. Method, according to claim 18, CHARACTERIZED by the fact that it further comprises rendering the plurality of reconstructed images on a display system.
20. Method, according to claim 18, CHARACTERIZED by the fact that it further comprises:
establishing, based, at least in part, on the set of adaptive reshaping parameters related to the approximation of the inverse of the adaptive reshaping function from the reshaped video signal, an approximation of a target lookup table (LUT) that represents the inverse of the adaptive reshaping function.
Legal status:
2019-10-08 | B09A | Decision: intention to grant [chapter 9.1 patent gazette]
2019-10-22 | B16A | Patent or certificate of addition of invention granted | Free format text: TERM OF VALIDITY: 20 (TWENTY) YEARS FROM 17/03/2016, SUBJECT TO THE LEGAL CONDITIONS.